Preface


The development of high-order accurate numerical discretization techniques
for irregular domains and meshes is often cited as one of the remaining
challenges facing the field of computational fluid dynamics. In structural
mechanics, the advantages of high-order finite element approximation are widely
recognized. This is especially true when high-order element approximation is
combined with element refinement (h-p refinement). In computational fluid
dynamics, high-order discretization methods are infrequently used in the
computation of compressible fluid flow. The hyperbolic nature of the governing
equations and the presence of solution discontinuities make high-order
accuracy difficult to achieve. Consequently, second-order accurate methods are
still predominantly used in industrial applications even though evidence
suggests that high-order methods may offer a way to significantly improve the
resolution and accuracy for these calculations.

To address this important topic, a special course was jointly organized by
the Applied Vehicle Technology Panel of NATO’s Research and Technology
Organization (RTO), the von Karman Institute for Fluid Dynamics, and the
Numerical Aerospace Simulation Division at the NASA Ames Research Center.
The NATO RTO sponsored course entitled “Higher Order Discretization
Methods in Computational Fluid Dynamics” was held September 14-18, 1998
at the von Karman Institute for Fluid Dynamics in Belgium and September
21-25, 1998 at the NASA Ames Research Center in the United States. During
this special course, lecturers from Europe and the United States gave a series
of comprehensive lectures on advanced topics related to the high-order
numerical discretization of partial differential equations with primary emphasis
given to computational fluid dynamics (CFD). Additional consideration was
given to topics in computational physics such as the high-order discretization
of the Hamilton-Jacobi, Helmholtz, and elasticity equations.

This volume consists of five articles prepared by the special course
lecturers. These articles should be of particular relevance to those readers with
an interest in numerical discretization techniques which generalize to very
high-order accuracy. The articles of Professors Abgrall and Shu consider the
mathematical formulation of high-order accurate finite volume schemes
utilizing essentially non-oscillatory (ENO) and weighted essentially non-oscillatory
(WENO) reconstruction together with upwind flux evaluation. These
formulations are particularly effective in computing numerical solutions of
conservation laws containing solution discontinuities. Careful attention is given by
the authors to implementational issues and techniques for improving the
overall efficiency of these methods. The article of Professor Cockburn discusses
the discontinuous Galerkin finite element method. This method naturally
extends to high-order accuracy and has an interpretation as a finite
volume method. Cockburn addresses two important issues associated with the
discontinuous Galerkin method: controlling spurious extrema near solution
discontinuities via “limiting” and the extension to second order
advective-diffusive equations (joint work with Shu). The articles of Dr. Henderson and
Professor Schwab consider the mathematical formulation and
implementation of the h-p finite element methods using hierarchical basis functions and
adaptive mesh refinement. These methods are particularly useful in
computing high-order accurate solutions containing perturbative layers and corner
singularities. Additional flexibility is obtained using a mortar FEM technique
whereby nonconforming elements are interfaced together. Numerous
examples are given by Henderson applying the h-p FEM method to the simulation
of turbulence and turbulence transition.

The organizers gratefully acknowledge the special course lecturers for their
substantial effort in preparing the articles for publication. The organizers
also acknowledge the generous support of NATO RTO, the von Karman
Institute, and the NASA Ames Research Center for sponsoring and holding
the special course. Additional thanks are also given to RTO and
Springer-Verlag for publishing the articles in a quality book form and so making them
available to a wide readership.


Timothy  Barth  (NASA  Ames  Research  Center) 

Herman  Deconinck  (von  Karman  Institute  for  Fluid  Dynamics) 


March  1999 
Abstract. We describe in detail some techniques to construct high order MUSCL
type schemes on general meshes: the ENO and WENO type schemes. Special
attention is given to the reconstruction step. The extension to Hamilton-Jacobi
equations is sketched. We also present some hybrid techniques that use simple
modifications of classical TVD schemes and yield a very clear improvement in
accuracy. We discuss means of improving the efficiency using Harten’s
multiresolution analysis. We provide several numerical examples and comparisons
with more conventional schemes.
1  Introduction

During the past few years, there has been growing interest in constructing
high order accurate and robust schemes for simulations of compressible fluid
flow. One of the difficulties is the appearance of strong discontinuities that
may interact, even for smooth initial data. A possible way to handle this
difficulty is to use a TVD (Total Variation Diminishing) scheme.
Such a scheme has the property, at least for 1D scalar equations, of not
creating new extrema, and hence provides a clean treatment of discontinuities.
TVD schemes have been successfully and widely used with any type of mesh (see
for example [48] for a review and, among many others, [25] for simulations on
finite element type meshes). They are now in common use even for industrial
simulations of flows in complex geometries. Nevertheless, one of their main
weaknesses is that the order of accuracy drops to first order in regions
of discontinuity and at extrema, leading to excessive numerical dissipation.

Various methods have been proposed to overcome this difficulty (adaptation
of the mesh, for example; see [27,32]). One promising approach, though not
the only one as we show in this paper, is the class of
Essentially Non-Oscillatory schemes (ENO for short) introduced by Harten,
Osher and others [18,19,34,15], together with their WENO modifications.

The  basic  idea  of  ENO  schemes  is  the  use  of  a  Lagrange  type  interpolation 
with  an  adapted  stencil:  when  a  discontinuity  is  detected,  the  procedure  looks 
for  a  region  around  this  discontinuity  where  the  function  is  smooth  and  least 
oscillatory.  This  reconstruction  technique  may  be  applied  either  to  the  nodal 
values  [34]  or  to  a  particular  function  constructed  from  cell  averages  in  control 
volumes  [18,19].  In  this  latter  case,  the  approximation  is  conservative.  This 
enables  one  to  approximate  any  piecewise  smooth  function  with  any  desired 
order  of  accuracy. 

The purpose of this paper is twofold: first to provide a description
of the basic ideas of the “classical” ENO/WENO methods, and second to
show how it is possible to adapt them to general geometries. It is not
possible to provide a complete overview of ENO methods. Hence, this paper will
concentrate on finite volume ENO methods and triangular meshes.
Nevertheless, we believe that it will provide information on the typical features of
ENO methods on general geometries, and on the difficulties and problems
associated with them.

This paper is organized as follows: In §2, we first recall the principle of
finite volume schemes of MUSCL-type [52]. This enables us to describe the
three steps of a finite volume scheme: the reconstruction, the evolution and
the projection step. In general, the last two steps are merged by means of a
Riemann solver and an appropriate temporal discretization scheme. Before
entering the main topic of this paper, the reconstruction step, we describe
Runge-Kutta methods due to Shu and Osher that make it possible to keep the
TVD or TVB (Total Variation Bounded) features of the first order approximation
of a generalized Riemann problem with non-constant data. Then we move to
the reconstruction problem. First, in §3, we detail the “classical” methods
that are applied to real valued functions and show why they cannot be applied
on general geometries.

To overcome this difficulty, we introduce a new reconstruction procedure
in §4. We show that its properties appear to be the same as those of
a more classical Lagrange reconstruction. The practical calculation of this
polynomial is discussed in detail in §5, and we show that it can be determined
by an algorithm very similar to the Newton algorithm for divided differences.
We also discuss the impact of the choice of the polynomial expansion in the
calculation from the point of view of stability. The ENO reconstruction is then
described in §6. We also discuss other types of expansions using splines in §8.
In §11, we briefly discuss how all these methods can be extended to first order
Hamilton-Jacobi equations. A third order ENO method for CFD problems is
applied in §9 to several flow problems, and we show that the accuracy
is improved considerably in comparison with 2nd order computations.

However, these methods are by nature quite costly. This is why we also
discuss (§10) a particular technique aimed at reducing the CPU cost of ENO
methods. Toward the same goal (getting very high order schemes with the
lowest computational cost), we also show (§12) how to modify the now very
classical second order MUSCL type schemes (on Cartesian meshes) so that
numerical diffusion is significantly reduced.


2  An  overview  of  finite  volume  schemes 


2.1  The  Euler  equations 


Let us quickly recall elementary facts about the Euler equations of a
calorically perfect gas:

$$\frac{\partial W}{\partial t} + \frac{\partial F(W)}{\partial x} + \frac{\partial G(W)}{\partial y} = 0 \tag{2.1}$$


As  usual,  in  equation  (2.1),  W  stands  for  the  vector  of  conserved  quantities 
and  F  (respectively  G )  is  the  flux  in  the  x  direction  (resp.  y  direction): 


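$$W = \begin{pmatrix} \rho \\ \rho u \\ \rho v \\ E \end{pmatrix}, \quad
F(W) = \begin{pmatrix} \rho u \\ \rho u^2 + p \\ \rho u v \\ u(E + p) \end{pmatrix}, \quad
G(W) = \begin{pmatrix} \rho v \\ \rho u v \\ \rho v^2 + p \\ v(E + p) \end{pmatrix}, \tag{2.2}$$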


with initial and boundary conditions. In equation (2.2), $\rho$ is the density,
$u, v$ are the components of the velocity, $E$ is the total energy and $p$ the
pressure, related to the conserved quantities by the equation of state:

$$p = (\gamma - 1)\left(E - \tfrac{1}{2}\rho\,(u^2 + v^2)\right) \tag{2.3}$$

The ratio of specific heats $\gamma$ is kept constant.
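As an illustration of (2.2) and (2.3), the following minimal sketch evaluates the pressure and the two flux vectors from a conserved state. The function names and the default value $\gamma = 1.4$ are assumptions made for this example only; they do not come from the text.

```python
# Minimal sketch of the Euler fluxes (2.2) with the equation of state (2.3).
# GAMMA = 1.4 is an assumed value (a common choice for air), not taken from the text.
GAMMA = 1.4


def pressure(W, gamma=GAMMA):
    """Pressure from the conserved state W = (rho, rho*u, rho*v, E), eq. (2.3)."""
    rho, rho_u, rho_v, E = W
    kinetic = 0.5 * (rho_u ** 2 + rho_v ** 2) / rho  # = 0.5 * rho * (u^2 + v^2)
    return (gamma - 1.0) * (E - kinetic)


def fluxes(W, gamma=GAMMA):
    """Return the flux vectors F(W) and G(W) of eq. (2.2) as tuples."""
    rho, rho_u, rho_v, E = W
    u, v = rho_u / rho, rho_v / rho
    p = pressure(W, gamma)
    F = (rho_u, rho_u * u + p, rho_u * v, u * (E + p))
    G = (rho_v, rho_u * v, rho_v * v + p, v * (E + p))
    return F, G


def normal_flux(W, nx, ny, gamma=GAMMA):
    """Flux in the direction n = (nx, ny): nx * F(W) + ny * G(W)."""
    F, G = fluxes(W, gamma)
    return tuple(nx * f + ny * g for f, g in zip(F, G))
```

For a gas at rest, `fluxes((1.0, 0.0, 0.0, 2.5))` reduces to the pressure terms only, which is a quick sanity check on the signs and the placement of $p$.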

It is well known that the system defined by equations (2.1), (2.2) and
(2.3) is hyperbolic: for any vector $n = (n_x, n_y)$, the matrix

$$A_n = n_x \frac{\partial F}{\partial W} + n_y \frac{\partial G}{\partial W} \tag{2.4}$$

is diagonalizable and has a full set of real eigenvalues and eigenvectors. Let
us now describe the construction of a high order scheme.
2.2  Finite  volume  formulation

We consider a mesh $\mathcal{M}$ consisting of control volumes $\{C_i\}$, for
example the triangles of a conforming triangulation or the boxes of a dual
mesh, see [1]. The semi-discrete finite volume formulation of (2.1) is:

$$\frac{d\overline{W}_i}{dt} = -\frac{1}{|C_i|} \int_{\partial C_i} \mathcal{F}_n[W(x,t)]\,dl = \mathcal{L}_i(t) \tag{2.5}$$

Here, $\overline{W}_i(t)$ is the (spatial) mean value of $W(x,t)$ at time $t$
over $C_i$, $n = (n_x, n_y)$ is the outward unit normal to $\partial C_i$, and
$\mathcal{F}_n = n_x F + n_y G$. We first describe the spatial approximation of
(2.5), then the temporal discretization of the resulting set of ordinary
differential equations. Last, we give details concerning the boundary
conditions.


Spatial discretization  The first step is to discretize $\mathcal{L}_i(t)$ up
to $k$th order. We define the integer $p$ such that either $k = 2p$ or
$k = 2p + 1$. We can rewrite $-|C_i|\,\mathcal{L}_i(t)$ as:

$$\int_{\partial C_i} \mathcal{F}_n[W(x,t)]\,dl = \sum_{\Gamma_s} \int_{\Gamma_s} \mathcal{F}_n[W(x,t)]\,dl \tag{2.6}$$

where, as in Figure 13, the set of the $\Gamma_s$'s is that of the edges of
$C_i$. On each $\Gamma_s$, $n$ is constant. We consider, on any $\Gamma_s$, the
$p$ Gaussian points $\{G_l\}_{1 \le l \le p}$ associated with the Gaussian
formula of order $2p+1$. The integral
$\int_{\Gamma_s} \mathcal{F}_n[W(x,t)]\,dl$ is approximated by

$$|\Gamma_s| \sum_{l=1}^{p} \omega_l\, \mathcal{G}_{n,l}(t) \tag{2.7}$$

where the $\omega_l$ are the weights of the Gaussian formula, normalized to sum
to one, and the term $\mathcal{G}_{n,l}(t)$ has to be defined. Let $C_j$ be the
other control volume of which $\Gamma_s$ is a boundary part. In $C_i$ and
$C_j$, compute approximations of $W$ at time $t$, as well as so-called
reconstruction functions (recovery functions) $R_i[W(\cdot,t)]$ and
$R_j[W(\cdot,t)]$. The ENO reconstruction described in this paper (see §4) is
applied to the physical variables, then the conserved ones are derived from
them. We define

$$\mathcal{G}_{n,l}(t) = \mathcal{F}_n^{\mathrm{Riemann}}\left\{ R_i[W(\cdot,t)](G_l),\ R_j[W(\cdot,t)](G_l) \right\}. \tag{2.8}$$

In equation (2.8), $\mathcal{F}_n^{\mathrm{Riemann}}$ may be any of the
available Riemann solvers. In all the examples below, we have chosen Roe’s
Riemann solver with the Harten-Hyman entropy correction. The boundary
conditions are implemented as in [25].
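The edge quadrature used for the flux integral can be sketched as follows. This is only an illustration under stated assumptions: the integrand `f` is an arbitrary scalar function of position standing in for the Riemann flux at a Gaussian point, the tabulated Gauss-Legendre rules are limited to $p \le 3$, and the names `GAUSS_RULES` and `edge_integral` are hypothetical.

```python
import math

# Gauss-Legendre nodes and weights on [-1, 1] for p = 1, 2, 3 points.
GAUSS_RULES = {
    1: ([0.0], [2.0]),
    2: ([-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0)], [1.0, 1.0]),
    3: ([-math.sqrt(0.6), 0.0, math.sqrt(0.6)], [5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0]),
}


def edge_integral(f, a, b, p):
    """Approximate the line integral of f(x, y) over the straight edge from a to b.

    The p Gaussian points are mapped from [-1, 1] onto the edge; the weights are
    halved so that they sum to one and the edge length appears as a prefactor.
    """
    nodes, weights = GAUSS_RULES[p]
    length = math.hypot(b[0] - a[0], b[1] - a[1])
    total = 0.0
    for xi, w in zip(nodes, weights):
        s = 0.5 * (1.0 + xi)                      # position along the edge in [0, 1]
        gx = a[0] + s * (b[0] - a[0])
        gy = a[1] + s * (b[1] - a[1])
        total += 0.5 * w * f(gx, gy)              # normalized weight times point value
    return length * total
```

A rule with $p$ points integrates polynomials of degree up to $2p-1$ along the edge exactly, so a quadratic integrand is already handled exactly with `p=2`.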

We see that the only remaining degree of freedom is the evaluation of
$R_j[W(\cdot,t)]$, which should be a “good” approximation of $W$. It is natural
to ask for the following properties [18,19] for the reconstruction $R[u]$ of a
function $u$:

P1- If $u$ is of class $C^r$ with $r > k$, then the $l$-th order derivative of
$u - R[u]$, $l \le k$, satisfies $D^l(u - R[u]) = O(h^{k+1-l})$,

P2- $TV(R[u]) \le TV(u) + O(h^r)$,

P3- The average of $R[u]$ over $[x_{i-1/2}, x_{i+1/2}]$ is equal to that of $u$.

Roughly speaking, one may say that property P1 guarantees the accuracy of the
approximation, property P2 guarantees (for reasonable flux functions) that the
scheme is Total Variation Bounded (TVB) and hence converges to the correct
solution for any $t \in [0, T]$, while property P3 states the consistency of
the scheme. Before entering into the details of the reconstruction step, let
us briefly comment on the evolution operator.

Approximation of the evolution operator  A classical way of solving
a set of ordinary differential equations like (2.5) is to use a Runge-Kutta
scheme. Some of these schemes have the property of not increasing
the total variation [34]. They are built as follows. The set of equations to be
solved is assumed to be written in the form

$$\frac{du}{dt} = \mathcal{L}(u),$$

where the operator $\mathcal{L}$ contains spatial derivatives.

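1. Second order scheme: This is the classical Heun’s method. It is TVD
under CFL=1.

$$u^{(1)} = u^{(0)} + \Delta t\,\mathcal{L}(u^{(0)})$$

$$u^{(2)} = \tfrac{1}{2}u^{(0)} + \tfrac{1}{2}u^{(1)} + \tfrac{1}{2}\Delta t\,\mathcal{L}(u^{(1)})$$

2. Third order scheme, TVD under CFL=1:

$$u^{(1)} = u^{(0)} + \Delta t\,\mathcal{L}(u^{(0)})$$

$$u^{(2)} = \tfrac{3}{4}u^{(0)} + \tfrac{1}{4}\left(u^{(1)} + \Delta t\,\mathcal{L}(u^{(1)})\right)$$

$$u^{(3)} = \tfrac{1}{3}u^{(0)} + \tfrac{2}{3}\left(u^{(2)} + \Delta t\,\mathcal{L}(u^{(2)})\right)$$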

Until  the  end  of  this  paper,  we  will  not  give  more  details  on  the  time  stepping 
since  our  purpose  is  to  concentrate  on  the  reconstruction  step  of  a  finite 
volume  scheme. 
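The two Runge-Kutta schemes above are written as convex combinations of forward Euler steps, which makes them short to code. The sketch below is an illustration only: `L` is any user-supplied callable standing in for the spatial operator, and the function names are hypothetical. It is written for scalars, but works unchanged for any state type supporting addition and scalar multiplication.

```python
def ssp_rk2_step(u, dt, L):
    """One step of the second order (Heun) scheme written in convex form."""
    u1 = u + dt * L(u)
    return 0.5 * u + 0.5 * u1 + 0.5 * dt * L(u1)


def ssp_rk3_step(u, dt, L):
    """One step of the third order scheme of Shu and Osher."""
    u1 = u + dt * L(u)
    u2 = 0.75 * u + 0.25 * (u1 + dt * L(u1))
    return u / 3.0 + (2.0 / 3.0) * (u2 + dt * L(u2))


def integrate(step, u0, dt, n_steps, L):
    """Advance u0 over n_steps time steps of size dt with the given one-step method."""
    u = u0
    for _ in range(n_steps):
        u = step(u, dt, L)
    return u
```

Applied to the linear test equation $du/dt = -u$, one step of each scheme reproduces the Taylor expansion of $e^{-\Delta t}$ up to second and third order respectively, which is a convenient check on the coefficients.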

3  The  reconstruction  step  :  classical  methods 

3.1  An  essentially  non  oscillatory  Lagrange  interpolation 

The classical ENO reconstruction methods are derived using two well known
properties of the Lagrange interpolation of a function $u$: consider an
increasingly finer subdivision $(y_l)$ of $\mathbb{R}$, and $P$ a polynomial of
degree $r$ such that $P(y_l) = u(y_l)$, $m \le l \le m + r$. Then

1. if $u$ admits $k$ continuous derivatives on $[y_m, y_{m+r}]$, then for any
$x \in [y_m, y_{m+r}]$,

$$|P^{(k)}(x) - u^{(k)}(x)| \le C \max_{m \le l \le m+r-1} |y_l - y_{l+1}|^{r-k+1}$$

2. if the $k$-th derivative of $u$, $k \le r$, has a jump $[u^{(k)}]$ in
$x_0 \in\, ]y_m, y_{m+r}[$, then $a_k$, the leading coefficient of $P$ in its
Taylor expansion around $x_0$, behaves like
$\max_{m \le l \le m+r} |y_l - y_{l+1}|^{-r+k}$.

Although the first property is a well known result in numerical
analysis, the second property is (to our knowledge) nowhere clearly stated
and proved explicitly.

These  two  facts  explain  that  if  u  is  smooth  then  the  coefficients  of  P 
remain  bounded  when  the  mesh  size  decreases  but  blow  up  if  one  of  the 
derivatives  of  u  of  order  smaller  than  the  degree  of  P  has  a  jump. 

From these two remarks one may construct the following essentially
non-oscillatory Lagrange reconstruction in a neighborhood of a mesh point $y_i$.
The idea is to construct an “adaptive” stencil $S^k$, the points of which are
used to compute the Lagrangian interpolant $P^k$, where $P^k$ is a polynomial
of degree $k$.

First, we recall that the divided differences table can easily be recursively
constructed:

$$k = 0: \quad [y_i]u = u(y_i)$$

$$k > 0: \quad [y_i, y_{i+1}, \dots, y_{i+k-1}, y_{i+k}]u = \frac{[y_{i+1}, \dots, y_{i+k}]u - [y_i, y_{i+1}, \dots, y_{i+k-1}]u}{y_{i+k} - y_i}$$

We  can  now  describe  the  classical  ENO  interpolation  algorithm  of  Harten  [19], 
in  which  a  polynomial  is  constructed  which  does  not  interpolate  through  a 
discontinuity  of  u. 

ENO  interpolation  algorithm: 

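1. Start with $\mathcal{N}_i^0 = \{y_i\}$.

2. For $l \le k$, consider $\mathcal{N}_i^{l-1} = \{y_{j_0} < \dots < y_{j_0+l-1}\}$
as the stencil for $P^{l-1}$. Compute the divided differences
$[y_{j_0-1}, y_{j_0}, \dots, y_{j_0+l-1}]u$ and
$[y_{j_0}, \dots, y_{j_0+l-1}, y_{j_0+l}]u$.

- if $|[y_{j_0-1}, y_{j_0}, \dots, y_{j_0+l-1}]u| < |[y_{j_0}, \dots, y_{j_0+l-1}, y_{j_0+l}]u|$
then $\mathcal{N}_i^l = \mathcal{N}_i^{l-1} \cup \{y_{j_0-1}\}$,

- else, $\mathcal{N}_i^l = \mathcal{N}_i^{l-1} \cup \{y_{j_0+l}\}$.

end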
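The stencil selection above can be sketched in a few lines. This is an illustration, not the author's implementation: the divided differences are computed by direct recursion (fine for small $k$), the stencil is returned as a pair of index bounds, and the treatment of the ends of the grid (extend on the only available side) is an assumption that the text does not discuss.

```python
def divided_difference(y, u, lo, hi):
    """Divided difference [y_lo, ..., y_hi]u computed by the recursive definition."""
    if lo == hi:
        return u[lo]
    upper = divided_difference(y, u, lo + 1, hi)
    lower = divided_difference(y, u, lo, hi - 1)
    return (upper - lower) / (y[hi] - y[lo])


def eno_stencil(y, u, i, k):
    """Grow a stencil of k + 1 consecutive points around y[i].

    Returns (lo, hi): the stencil is y[lo], ..., y[hi]. At each stage the point
    giving the smaller divided difference in absolute value is added.
    """
    lo = hi = i
    for _ in range(k):
        can_left = lo - 1 >= 0
        can_right = hi + 1 < len(y)
        if not (can_left or can_right):
            break                                   # grid too small for degree k
        if can_left and can_right:
            left = abs(divided_difference(y, u, lo - 1, hi))
            right = abs(divided_difference(y, u, lo, hi + 1))
            go_left = left < right
        else:
            go_left = can_left                      # assumption: one-sided at grid ends
        if go_left:
            lo -= 1
        else:
            hi += 1
    return lo, hi
```

On step data, the stencil grown from a point on either side of the jump stays entirely on that side, which is the behavior the algorithm is designed to produce.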

When  one  applies  this  algorithm  to  the  Heaviside  function  on  a  grid  with 
uniform  mesh  size  Ax,  one  gets 


R(x)  = 


/  1  if  *  <  ^ 

\  -1  if  *  >  4* 
It is conjectured in [19] that if the function $u$ is smooth, say of class larger
than the degree $k$ of the reconstruction, then the derivatives of $u - R$ satisfy
$\max |u^{(l)} - R^{(l)}| \le C h^{k+1-l}$, where $C$ is a constant that depends
on the mesh and $u$, and $TV(R) \le TV(u) + O(h^k)$.


3.2  Application  to  finite  volume  schemes 

In finite volume schemes, the variables are not known at nodes but only
their mean values on control volumes are given. There are two ways of using
the above ENO Lagrange interpolation in order to get an ENO reconstruction:
the so-called reconstruction via primitive functions and reconstruction
via deconvolution. Since the reconstruction via deconvolution can work only
for regular meshes [19], we concentrate on the reconstruction via primitive
functions.

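Let us consider a mesh $(x_i)_{i \in \mathbb{N}} \subset \mathbb{R}$ and a real
valued function $u$. The averages $\overline{u}_i$ of $u$ are given on the
control volumes $[x_{i-1/2}, x_{i+1/2}]$. It is possible to know the values of
the primitive $W$ of $u$ defined by

$$W(x) = \int_{x_{-1/2}}^{x} u(t)\,dt$$

at the nodes $x_{i+1/2}$, because

$$W(x_{i+1/2}) = \int_{x_{-1/2}}^{x_{i+1/2}} u(t)\,dt = \sum_{j=0}^{i} (x_{j+1/2} - x_{j-1/2})\,\overline{u}_j.$$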

The choice of the index 0 for the lower bound in the definition of $W$ is
arbitrary. One now determines a local ENO Lagrange interpolant of $W$ up to
degree $k + 1$, say $P^{k+1}$, on the interval $]x_{i-1/2}, x_{i+1/2}[$. This is
obtained by constructing the ENO interpolant using the above ENO interpolation
algorithm. The nodal values $W(x_{i+1/2})$ are taken as interpolation data. The
derivative of the interpolant is finally restricted to the interval
$]x_{i-1/2}, x_{i+1/2}[$, i.e.

$$R[u] = \frac{dP^{k+1}}{dx} \quad \text{on } ]x_{i-1/2}, x_{i+1/2}[.$$

It is clear that if $u$ is smooth enough, then property P1 is satisfied. If $u$
has only isolated discontinuities itself or in one of its derivatives, property
P2 is also satisfied, see [1]. The last property P3 is a consequence of the
construction:

$$\int_{x_{i-1/2}}^{x_{i+1/2}} R[u]\,dx = P^{k+1}(x_{i+1/2}) - P^{k+1}(x_{i-1/2}) = (x_{i+1/2} - x_{i-1/2})\,\overline{u}_i.$$
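The reconstruction via primitive function, and the reason property P3 holds, can be checked numerically with the sketch below. It makes several simplifying assumptions that should be kept in mind: the interface stencil is fixed by the caller through `left` and `degree` rather than chosen by the ENO algorithm, the interpolant is built in Newton form, and all function names are hypothetical.

```python
import math


def primitive_at_faces(faces, ubar):
    """Values of the primitive W at the cell interfaces, with W = 0 at the first one."""
    W = [0.0]
    for i, mean in enumerate(ubar):
        W.append(W[-1] + (faces[i + 1] - faces[i]) * mean)
    return W


def newton_coefficients(xs, ws):
    """Newton (divided difference) coefficients of the interpolant of (xs, ws)."""
    c = list(ws)
    for order in range(1, len(xs)):
        for m in range(len(xs) - 1, order - 1, -1):
            c[m] = (c[m] - c[m - 1]) / (xs[m] - xs[m - order])
    return c


def eval_with_derivative(xs, c, x):
    """Horner evaluation of the Newton form: returns (P(x), P'(x))."""
    p, dp = c[-1], 0.0
    for m in range(len(c) - 2, -1, -1):
        dp = dp * (x - xs[m]) + p
        p = p * (x - xs[m]) + c[m]
    return p, dp


def reconstruct(faces, ubar, left, degree):
    """Interpolate W on faces[left : left + degree + 1]; return (P, R) with R = dP/dx."""
    W = primitive_at_faces(faces, ubar)
    xs = faces[left:left + degree + 1]
    c = newton_coefficients(xs, W[left:left + degree + 1])
    return (lambda x: eval_with_derivative(xs, c, x)[0],
            lambda x: eval_with_derivative(xs, c, x)[1])


def sine_cell_averages(faces):
    """Exact cell averages of sin(x), used only as test data."""
    return [(math.cos(a) - math.cos(b)) / (b - a) for a, b in zip(faces, faces[1:])]
```

Because $P$ interpolates $W$ at both interfaces of the cell, the mean of $R = dP/dx$ over the cell equals the given cell average regardless of the rest of the stencil; only the pointwise accuracy depends on the stencil choice.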
3.3  Possible  extensions  to  higher  dimensions  and  their
weaknesses 

The extension of this method to higher dimensions has been carried out,
for example by Casper et al. [10] for regular structured grids. The basics of
their method are the following. They first assume their mesh is a Cartesian
product $\{(x_i, y_j),\ 1 \le i \le N,\ 1 \le j \le M\}$ and they take
rectangular control volumes. With the notation
$\Delta_k \xi = \xi_{k+1/2} - \xi_{k-1/2}$, the data are

$$\overline{w}_{ij} = \frac{1}{\Delta_i x\,\Delta_j y} \int_{x_{i-1/2}}^{x_{i+1/2}} \int_{y_{j-1/2}}^{y_{j+1/2}} w(x,y)\,dx\,dy. \tag{3.1}$$
For $y_{j-1/2} < y < y_{j+1/2}$, they consider the primitive function $W_j(x)$
associated with $w$ defined by:

$$W_j(x) = \int_{x_0}^{x} \frac{1}{\Delta_j y} \int_{y_{j-1/2}}^{y_{j+1/2}} w(\xi, y)\,dy\,d\xi \tag{3.2}$$

From (3.1), they notice that

$$\Delta_i x\,\overline{w}_{ij} = W_j(x_{i+1/2}) - W_j(x_{i-1/2})$$

so that they can consider the reconstruction via primitive function of degree
$k$ of $W_j$: $v_j(x) = R(x, w)_j$.

Then, the procedure (3.2) is performed for any $j$. Since

$$\frac{d}{dx} W_j(x) = \frac{1}{\Delta_j y} \int_{y_{j-1/2}}^{y_{j+1/2}} w(x,y)\,dy,$$

$R(x, w)_j$ can be interpreted as a one-dimensional cell average on
$[y_{j-1/2}, y_{j+1/2}]$ of some function $v(x,y)$. For a fixed $x$, one
considers the set $\{R(x,w)_j\}$ and a primitive $V(x,y)$ associated with $v$,

$$V(x,y) = \int_{y_0}^{y} v(x,\eta)\,d\eta,$$

whose pointwise values are known at the interfaces:

$$V(x, y_{j+1/2}) = \sum_{k=j_0}^{j} \Delta_k y\, v_k(x).$$

We can once more apply the same reconstruction via primitive function of degree
$k$ to $v$ and construct a reconstruction $R^2(x,y,w) = R(y, R(x;w))$ of $w$.
It is clear that this new reconstruction has the conservation, essentially
non-oscillatory and accuracy properties: they are directly inherited from the
one-dimensional reconstruction properties.
When the mesh consists of quadrilaterals that are not a Cartesian product,
one has to assume the existence of a smooth transformation from the physical
$x$-$y$ plane to the rectangular $\xi$-$\nu$ plane: $x = x(\xi,\nu)$,
$y = y(\xi,\nu)$. The Jacobian determinant $J(\xi,\nu)$ should never vanish.
The control volumes $C_{ij}$ of the physical plane are the control volumes
$D_{ij} = ]\xi_{i-1/2}, \xi_{i+1/2}[ \times ]\nu_{j-1/2}, \nu_{j+1/2}[$
mapped by the transformation. The averaged values are:

$$\overline{u}_{ij} = \frac{1}{a_{ij}} \int_{\xi_{i-1/2}}^{\xi_{i+1/2}} \int_{\nu_{j-1/2}}^{\nu_{j+1/2}} u(x(\xi,\nu), y(\xi,\nu))\,J(\xi,\nu)\,d\xi\,d\nu$$

where $a_{ij}$ is the area of $C_{ij}$. Then one uses the above reconstruction
on the $\xi$-$\nu$ mesh. The reconstruction $R$ is:

$$R(\xi,\nu,\overline{u}) = \frac{1}{J(\xi,\nu)}\,R^2(\xi,\nu,a\,\overline{u})$$

The scaling factors are introduced so that $R$ satisfies the conservation
property.

From this, it is clear that this kind of reconstruction algorithm is very
dependent on the structure of the mesh. For example, for the reconstruction
via primitive function, one needs to gather control volumes into subsets so
that their collection is a square. For a reconstruction of degree $k$, one should
be able to gather them into subsets containing $k^2$ control volumes. This is in
general not possible, see Figure 13.

For all these reasons, one needs other algorithms to handle more general
geometries. In Section 4, we show how this can be done in the context of
general unstructured meshes.

4  The  reconstruction  problem  on  unstructured  meshes 

4.1  Preliminaries 

In the sequel, the symbol $\mathbb{P}_n[x,y]$ denotes the set of polynomials
$P$ in the variables $x$ and $y$ of degree less than or equal to $n$:

$$P(x,y) = \sum_{l=0}^{n} \sum_{i+j=l} a_{ij}\, x^i y^j \tag{4.1}$$

The set $\mathbb{P}_n[x,y]$ is a vector space of dimension
$N(n) = \frac{(n+1)(n+2)}{2}$, a basis of which is the set of monomials
$\{(x - x_0)^i (y - y_0)^j\}_{i+j \le n}$, where $(x_0, y_0)$ is any point in
$\mathbb{R}^2$. The degree of $P$ does not depend on the choice of
$(x_0, y_0)$. As we will show later, this kind of basis is not well suited for
practical calculations.

Let a set of points $\mathcal{M}$ be given. Associated with this set we also
consider a triangulation $\mathcal{T}$. We may consider several kinds of
control volumes, for example the triangles of $\mathcal{T}$ themselves or the
dual mesh. The dual mesh with its control volumes is constructed as follows:
for each point $M_i$ the control volume is obtained by connecting the midpoints
of the segments adjacent to it and the centers of gravity of the triangles of
which it is a vertex. Let us denote by $\{C_i\}$ the set of control volumes. We
only require the following properties:

- For any $i \ne j$, $C_i \cap C_j$ has empty interior,

- $C_i$ is connected,

- There is an algebraic dependency of the $C_i$'s in terms of the points of
$\mathcal{M}$, i.e. the points of $\mathcal{M}$ are within a specific location
inside the boxes, or the node points of the triangles, respectively.

- The boundary of $C_i$ is a polygonal line with at most $N_0$ vertices.

We consider the following problem (problem $\mathcal{P}$ or approximation in
the mean for short):

Let $u$ be regular enough (say in $L^1$). Given two integers $N$ and $n$, a
set of control volumes $S = \{C_{i_l}\}_{1 \le l \le N}$, find an element
$P \in \mathbb{P}_n[x,y]$ such that for $1 \le l \le N$,

$$\overline{u}_{i_l} := \langle A(C_{i_l}), u \rangle := \frac{\int_{C_{i_l}} u\,dx}{|C_{i_l}|} = \langle A(C_{i_l}), P \rangle. \tag{4.2}$$

For that problem to have a unique solution, one must satisfy two conditions:
$N = \frac{(n+1)(n+2)}{2} = N(n)$ and the following Vandermonde type matrix
must be non singular

$$V = \left( \langle A(C_{i_l}), x^i y^j \rangle \right)_{i+j \le n,\ 1 \le l \le N} \tag{4.3}$$

If $\det V \ne 0$, then we will say that this stencil $S$ is admissible. In that
case, there is a unique solution to problem (4.2).
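To make the approximation in the mean concrete, the sketch below assembles the matrix of mean values of the monomials over a stencil of triangles and solves the resulting square system. It rests on assumptions that are not part of the text: the control volumes are triangles, mean values are computed with the edge-midpoint rule (exact for polynomials of degree at most 2, so the sketch is only meaningful for $n \le 2$), the monomials are centered at the origin, and all names are hypothetical.

```python
def monomial_exponents(n):
    """Exponent pairs (i, j) with i + j <= n, ordered by total degree."""
    return [(i, l - i) for l in range(n + 1) for i in range(l + 1)]


def triangle_average(tri, f):
    """Mean of f over a triangle by the edge-midpoint rule (exact up to degree 2)."""
    (ax, ay), (bx, by), (cx, cy) = tri
    mids = [((ax + bx) / 2, (ay + by) / 2),
            ((bx + cx) / 2, (by + cy) / 2),
            ((cx + ax) / 2, (cy + ay) / 2)]
    return sum(f(x, y) for x, y in mids) / 3.0


def mean_value_matrix(stencil, n):
    """Rows: control volumes; columns: mean values of the monomials x^i y^j."""
    return [[triangle_average(tri, lambda x, y, i=i, j=j: x ** i * y ** j)
             for i, j in monomial_exponents(n)] for tri in stencil]


def solve_linear(A, b, tol=1e-12):
    """Gaussian elimination with partial pivoting; raises if the matrix is singular."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[piv][col]) < tol:
            raise ValueError("stencil is not admissible (singular matrix)")
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            fac = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= fac * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def solve_problem_P(stencil, ubar, n):
    """Coefficients of P, in the order of monomial_exponents(n), from the mean values."""
    V = mean_value_matrix(stencil, n)
    if len(V) != len(V[0]):
        raise ValueError("need exactly N(n) = (n+1)(n+2)/2 control volumes")
    return solve_linear(V, ubar)
```

For $n = 1$ the mean of a linear function over a triangle is its value at the centroid, so a stencil of three triangles is admissible exactly when the three centroids are not collinear.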

A similar problem was first considered by Barth et al. [8] for smooth
functions, then by Harten et al. [17], Vankeirsbilck et al. [39,38], Abgrall [1]
and Sonar [20]. In the first four references [8,17,39,38], the authors consider
overdetermined systems for two reasons: first, problem $\mathcal{P}$ does not
always have a unique solution; second, they claim that the condition number of
the overdetermined system is better than that of problem $\mathcal{P}$. In [1],
the same approach as here was adopted. To support this choice, we note first
that (4.3) is generally not singular. Second, the condition number of the linear
system mainly depends on the basis used for the polynomial expansion, as shown
in Section 4.4. For these two reasons, we have preferred this approach, which
also has the advantage of simplifying the coding of the global scheme.


4.2  Some  general  results  about  problem  $\mathcal{P}$

In this section, we give two results on the reconstruction (4.2) of a given
function $u$, depending on whether it is smooth or not. They generalize
well-known properties of the Lagrange interpolation of 1D real-valued functions
that have been used as a building block by Harten and his coauthors to design an
essentially non-oscillatory reconstruction. Throughout this section, if $S_n$ is
an admissible stencil for degree $n$, the symbol $K(S_n)$ denotes the convex
hull of the union of the elements of $S_n$.


The case of a smooth function  In [1], we show the following result. Its
proof follows easily from Ciarlet and Raviart’s proof [11] on Lagrange and
Hermite interpolation:

Theorem 1. Let $S$ be an admissible (for degree $n$) stencil of $\mathbb{R}^2$,
let $h$ and $\rho$ be the diameter of $K(S)$ and the supremum of the diameters
of all circles contained in $K(S)$, respectively. Let $u$ be a function that
has everywhere in $K(S)$ a derivative $D^{n+1}u$ with

$$M_{n+1} = \sup\{\|D^{n+1}u(x)\|;\ x \in K(S)\} < +\infty.$$

If $P_u$ is the solution to problem $\mathcal{P}$, then for any integer $m$,
$0 \le m \le n$,

$$\sup\{\|D^m u(x) - D^m P_u(x)\|;\ x \in K(S)\} \le C\,M_{n+1}\,\frac{h^{n+1}}{\rho^m}$$

for some constant $C = C(n,m,S)$. Moreover, if $S'$ is obtained from $S$ by an
affine transformation (i.e. there exist $x_0 \in \mathbb{R}^2$ and an invertible
matrix $A$ such that $C'_k \in S'$ iff there exists $C_k \in S$ such that
$C'_k = A\,C_k + x_0$), then

$$C(n,m,S) = C(n,m,S').$$

This result basically expresses that if the stencil $S$ is not too flat, i.e. the
ratio $h/\rho$ is not too large, then $P_u$ will be a good approximation of $u$.
Let us now turn to the case of nonsmooth functions.


4.3  The  case  of  a  nonsmooth  function 

We begin with some notation. Let $S_n$ be an admissible stencil. For the sake
of simplicity, we assume that $(x_0, y_0)$ is any point in $K(S_n)$. Throughout
this subsection, we adopt the lexicographic ordering for polynomials: if $i$
and $j$ are two indices such that $i + j = p \le n$, we set
$l = N(p-1) + i + 1$ (with the convention $N(-1) = 0$) and denote by $P_l$ the
monomial $(x - x_0)^i (y - y_0)^j$. We also set

$$R_l = \left( \langle A(C_1), P_l \rangle, \dots, \langle A(C_{N(n)}), P_l \rangle \right)^T.$$
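The lexicographic numbering of the monomials is easy to check with a short sketch. The function names `dim_poly` and `lex_index` are hypothetical; the formulas are those stated in the text, namely $N(n) = (n+1)(n+2)/2$ and $l = N(p-1) + i + 1$ with $p = i + j$.

```python
def dim_poly(n):
    """Dimension N(n) of the space of bivariate polynomials of degree <= n.

    The formula gives N(-1) = 0, which matches the convention used in the text.
    """
    return (n + 1) * (n + 2) // 2


def lex_index(i, j):
    """Index l (starting at 1) of the monomial (x - x0)^i (y - y0)^j."""
    p = i + j
    return dim_poly(p - 1) + i + 1


def all_indices(n):
    """Indices of all monomials of degree <= n, by increasing degree then increasing i."""
    return [lex_index(i, p - i) for p in range(n + 1) for i in range(p + 1)]
```

For a given $n$, `all_indices(n)` enumerates $1, \dots, N(n)$ exactly once, so the numbering is a bijection between monomials and columns of the matrix $M_n$, and the monomials of top degree $n$ occupy the last $n + 1$ positions.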
Given a set of $N(n)$ real numbers $\overline{u}_j$, the solution
$P = \sum_j a_j P_j$ of problem (4.2) may be seen as the solution of the linear
system

$$M_n\,(a_1, \dots, a_{N(n)})^T = U := (\overline{u}_1, \dots, \overline{u}_{N(n)})^T,$$

where the $l$-th column of $M_n$ is $R_l$.

The aim of this section is to give the asymptotic behavior of the leading
coefficients of $P$. By scaling arguments, we see that if the data
$\overline{u}_i$ are the averages of some $u$ that is smooth up to order
$p < n$, one should have

$$\sum_{N(n-1)+1 \le l \le N(n)} |a_l| \sim h^{p-n},$$

where $h$ is the scaling factor (a typical size of the cells). Unfortunately,
this kind of argument assumes that we work with a very particular set of
stencils: all stencils are obtained by a similarity transformation from a
“mother” stencil. This is far from what we need, namely an estimate involving
the typical size $h$ of an arbitrary admissible stencil, for $h$ small.
Moreover, it is possible that for some stencils

$$\sum_{N(n-1)+1 \le l \le N(n)} |a_l| = 0.$$

This means that $P$ is of degree $n - 1$ at most. With this kind of stencil, no
information at all is obtained from the leading coefficients.

We need to work with admissible stencils such that the polynomial $P$,
solution of problem (4.2) (for degree $n$) with data that are either 0 or taken
among linear combinations of terms belonging to $\{\langle A(C), P_l\rangle\}_{C \in S_n}$, is exactly
of degree $n$. More precisely, we define the set $\mathcal{P}^{\alpha,\beta}_{n,p}$ for $\alpha, \beta > 0$,
$p \in \{1, \dots, n\}$ and $M = (N(p) - N(p-1)) \times N_0 \times N(n)$ by

Definition 2. Given any $p \in \{1, \dots, n\}$ and two real numbers $\alpha > 0$, $\beta > 0$,
we define the set $\mathcal{P}^{\alpha,\beta}_{n,p}$ as follows: $S_n \in \mathcal{P}^{\alpha,\beta}_{n,p}$ if and only if

1. The diameter $h$ of $K$, the convex hull of $S_n$, is 1 and $(0,0) \in K$;

2. The stencil $S_n$ is admissible and
$|\det(R_1, \dots, R_{N(n)})| \ge \beta$;

3. For any polynomial $Q$ of degree exactly $p$,
$Q = \sum_{l=0}^{p} \sum_{i+j=l} \lambda_{ij} (x - x_0)^i (y - y_0)^j$ with $\max_{l,\, i+j=l} |\lambda_{ij}| = 1$,
and for any partition $S_0$, $S_1$ of $S_n$ with $\#S_0 > 0$ and $\#S_1 > 0$,
the polynomial $P \in \mathbb{P}_n(\mathbb{R}^2)$ defined by
$\langle A(C_k), P\rangle = 0$ if $C_k \in S_0$ and $\langle A(C_k), P\rangle = \langle A(C_k), Q\rangle$ if $C_k \in S_1$
satisfies $\sum_{i+j=n} |a_{ij}| \ge \alpha$.

In the above definition, the polynomial $Q$ cannot identically vanish. The
polynomial $P$ is of degree exactly $n$ and its leading coefficients cannot be very
small. It can be shown that the inequality $|\det(R_1, \dots, R_{N(n)})| \ge \beta$ implies that
the stencils are not too flat, i.e. the ratio $h/\rho = 1/\rho$ is not too large. Algebraic
arguments indicate that for any $n$ and $p$, we can find $\alpha, \beta > 0$ such that the
set $\mathcal{P}^{\alpha,\beta}_{n,p}$ is not empty.

To motivate Definition 2, we give a counterexample in $\mathbb{R}$; counterexamples
in higher dimensions can also be obtained [1]. Consider the stencil
$\{x_0 = 0, x_1 = 1, x_2 = 2\}$, and $P$, the polynomial of degree two such that $P(0) = 1$,
$P(1) = 0$, $P(2) = 0$. We then have $P(3) = 1$. The stencil $\{x_0 = 0, x_1 = 1,
x_2 = 2, x_3 = 3\}$ is admissible for degree three, but the cubic interpolating the
data $(1, 0, 0, 1)$ on it is $P$ itself, whose cubic coefficient vanishes. Hence this
stencil does not satisfy the analogue of Definition 2 in $\mathbb{R}$.
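
The counterexample can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the variable names are illustrative and not taken from the text. It solves the $4 \times 4$ Vandermonde system for the cubic interpolant of the data $(1, 0, 0, 1)$ and inspects the leading coefficient.

```python
import numpy as np

# Stencil {0, 1, 2, 3} with data (1, 0, 0, 1): the quadratic P with
# P(0) = 1, P(1) = 0, P(2) = 0 also takes the value 1 at x = 3.
nodes = np.array([0.0, 1.0, 2.0, 3.0])
data = np.array([1.0, 0.0, 0.0, 1.0])

# Solve the 4 x 4 Vandermonde system for the cubic interpolant
# c[0] + c[1] x + c[2] x^2 + c[3] x^3.
vandermonde = np.vander(nodes, N=4, increasing=True)
coeffs = np.linalg.solve(vandermonde, data)

print("coefficients (constant term first):", coeffs)
print("leading (cubic) coefficient:", coeffs[3])
```

The cubic coefficient vanishes up to rounding, so the leading coefficient carries no information for this partition of the stencil, which is exactly the situation that item 3 of Definition 2 rules out.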

We  have  the  following  result. 

Theorem 3. Let $n$, $p$ be integers and $\alpha$, $\beta$ and $\delta$ be real numbers,
as in Definition 2, and let $S_n$ be an admissible stencil such that
there exists an affine invertible transformation $\varphi$ for which $\varphi^{-1}(S_n) \in \mathcal{P}^{\alpha,\beta}_{n,p}$
and $\|\varphi^{-1}\| \le \delta$. Let $u$ be a real-valued function defined on an open subset
$\Omega$ of $\mathbb{R}^2$, $u \in C^p$, $p \le n$, except on a locally $C^1$ simple curve $\mathcal{C}$. The
intersection of the convex hull $K$ of $S_n$ and $\mathcal{C}$ is assumed to be nonempty.
We also assume that the $p$th-order derivative of $u$ has a jump such that
$\min_{(x,y) \in \mathcal{C}} \|[D^p u](x,y)\| \ge \gamma > 0$.

Then there exists a constant $C(n,p,\alpha,\beta,\delta) > 0$, invariant by affine
transformations, such that the coefficients in the Taylor expansion of $u$ around any
point $(x_0, y_0)$ satisfy

$$\sum_{i+j=n} |a_{ij}| \ge C(n,p,\alpha,\beta,\delta)\, \frac{\gamma}{h^{n-p}} \qquad (4.4)$$

if $h$ is small enough.

This  result  enables  us  to  distinguish  between  regions  of  smoothness  and 
those  where  a  jump  in  one  of  the  derivatives  occurs. 


4.4  Three  polynomial  expansions 

In  this  section,  we  intend  to  study  the  numerical  system  that  has  to  be  solved 
in  order  to  get  P  from  the  data.  We  will  consider  three  kinds  of  expansions 
of  P: 

1. the "natural" expansion: for any point $(x_0, y_0) \in \mathbb{R}^2$,

$$P = \sum_{0 \le i+j \le n} a_{ij} (x - x_0)^i (y - y_0)^j, \qquad (4.5)$$

2. an expansion using scaling. Define a local scaling factor $s := 1/\sqrt{|C|}$,
which should be read as an approximation for $1/h$, and change the "natural"
expansion into

$$P = \sum_{0 \le i+j \le n} a_{ij}\, s^{i+j} (x - x_0)^i (y - y_0)^j, \qquad (4.6)$$

3. an expansion using barycentric coordinates. Let $S_n = \{C_1, C_2, C_3, \dots,
C_{N(n)}\}$ be an admissible stencil. Hence, at least one subset of three elements
of $S_n$ is an admissible stencil for $n = 1$. We may assume that the
set $\{C_1, C_2, C_3\}$ is admissible. We consider the three polynomials $\Lambda_i$ of
degree 1 defined by $A(C_j)\Lambda_i = \delta_i^j$, $1 \le i \le 3$, $1 \le j \le 3$. Clearly, we have
$\Lambda_1 + \Lambda_2 + \Lambda_3 = 1$. These polynomials are the barycentric coordinates of
the triangle constructed on the barycenters of $C_1$, $C_2$, and $C_3$. In order
to get expansion (4.5), a strategy may be to look first for the expansion
of the polynomial $P$ in terms of powers of $\Lambda_1$ and $\Lambda_2$:

$$P = \sum_{i+j \le n} a_{ij}\, \Lambda_1^i \Lambda_2^j \qquad (4.7)$$

and then to get the Taylor expansion of $P$ around the center of gravity
of $C_1$ from (4.7) (Theorems 1 and 3 give the behaviour of the leading
coefficients of $P$ whatever point is chosen in the convex hull of $S$).

In order to get the expansions (4.5), (4.6) or (4.7), one has to solve linear
$N(n) \times N(n)$ systems:

$$B\, (a_{00}, \dots, a_{0n})^T = \big(\langle A(C_1), u\rangle, \dots, \langle A(C_{N(n)}), u\rangle\big)^T \qquad (4.8)$$

where the matrix $B$ is obtained by taking the average of $(x - x_0)^i (y - y_0)^j$ for
(4.5), the same average times $s^{i+j}$ for (4.6), and the average of $\Lambda_1^i \Lambda_2^j$ for (4.7).
Let us now study the properties of these linear systems.


The case of the natural expansion. A very easy consequence of the
inequality (4.4) is:

Proposition 4. Let us assume that the conditions of Theorem 3 hold and let
$h$ be the supremum of the diameters of the spheres containing $K(S_n)$. Then
the condition number of system (4.8) is at least $O(h^{-n})$ for $h$ small enough.

This fact is well known for 1D Lagrange interpolation and has motivated
the search for more efficient algorithms, such as the Newton algorithm. There
exist algorithms that generalize it [31,30]. In §5, we propose a completely
algebraic algorithm that we show to be equivalent to the generalization of
[31,30] for the cell average recovery problem (4.2). This method makes use
of the barycentric coordinate expansion (4.7).

The  case  of  the  barycentric  expansion  In  the  case  of  expansion  (4.7), 
we  have  the  following  result: 

Proposition  5.  Under  the  assumption  of  Theorem  3  the  condition  number 
of  the  system  (4.8)  for  the  expansion  (4.7)  is  bounded  from  above  and  below 
by  constants  independent  of  h,  the  supremum  of  the  diameters  of  the  circles 
containing  K(Sn). 

For this reason, the barycentric expansion is more suitable in practical
calculations.

The case of the scaled expansion. In [21] it is shown:

Proposition 6. The condition number of system (4.8) for expansion (4.6)
is invariant with respect to isotropic grid scaling. In particular it does not
depend on $h$.

This means that system (4.8) can be solved stably even on fine grids,
which is a requirement for practical computations.

The scaling technique is simpler than using the barycentric expansion, but
for anisotropically scaled grids it is not clear whether it is sufficient or not.
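
The contrast between Propositions 4 and 6 can be illustrated on a simplified one-dimensional analogue. The sketch below is an assumption-laden illustration, not the cell-average system (4.8) itself: it uses point values at four equispaced nodes with spacing $h$, takes $x_0 = 0$ and $s = 1/h$, and compares the condition number of the matrix built from the natural monomials $(x - x_0)^k$ with that built from the scaled monomials $(s(x - x_0))^k$. The function names are hypothetical.

```python
import numpy as np

def natural_matrix(h, n=3):
    """Point values of (x - x0)^k, k = 0..n, at nodes 0, h, ..., n*h (x0 = 0)."""
    nodes = h * np.arange(n + 1)
    return np.vander(nodes, N=n + 1, increasing=True)

def scaled_matrix(h, n=3):
    """Same matrix for the scaled monomials (s (x - x0))^k with s = 1/h."""
    nodes = h * np.arange(n + 1)
    return np.vander(nodes / h, N=n + 1, increasing=True)

mesh_sizes = (1e-1, 1e-2, 1e-3)
cond_natural = {h: np.linalg.cond(natural_matrix(h)) for h in mesh_sizes}
cond_scaled = {h: np.linalg.cond(scaled_matrix(h)) for h in mesh_sizes}

for h in mesh_sizes:
    print(f"h = {h:g}: natural {cond_natural[h]:.3e}, scaled {cond_scaled[h]:.3e}")
```

In this toy setting the condition number of the natural system grows roughly like $h^{-n}$ as the nodes are refined, while the scaled system keeps the same condition number for every $h$.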


5  The explicit calculation of the reconstruction:
Mühlbach expansions, Tschebyscheff systems and
divided differences


In  this  section,  we  sketch  the  main  results  of  [6]  and  provide  the  link  between 
the  previous  section  and  divided  differences.  In  fact,  when  computing  the 
reconstruction  polynomial  by  the  reconstruction-via-primitive  technique,  it 
is  surprising  to  see  that  the  formulae  look  very  similar  to  divided  differences 
formula,  even  though  an  additional  derivative  has  been  taken.  In  this  section 
we  explain  this  fact  and  show  that  the  algorithm  is  indeed  the  same  as  the 
one  based  on  divided  differences. 

We begin with a definition and stay as close as possible to the notation
introduced by Mühlbach in [31]. The functional space $V$ is in practice the
space of continuous functions on $\Omega$ or $L^1(\Omega)$.

Definition 7. The functions $\varphi_1, \dots, \varphi_n \in V$ form an I-Tschebyscheff system
on $\Omega$ if the condition

$$V\begin{pmatrix} \varphi_1, \dots, \varphi_n \\ \lambda_1, \dots, \lambda_n \end{pmatrix} :=
\begin{vmatrix}
\langle \lambda_1, \varphi_1 \rangle & \cdots & \langle \lambda_1, \varphi_n \rangle \\
\vdots & & \vdots \\
\langle \lambda_n, \varphi_1 \rangle & \cdots & \langle \lambda_n, \varphi_n \rangle
\end{vmatrix} \ne 0$$

holds for the set of linear forms (information) $I = (\lambda_1, \dots, \lambda_n)^T$.
We refer to $V\begin{pmatrix} \varphi_1, \dots, \varphi_n \\ \lambda_1, \dots, \lambda_n \end{pmatrix}$ as the generalized Van der Monde determinant.

If $\{x_1, \dots, x_n\}$ denotes a set of $n$ distinct points in $\Omega$ and $\lambda_i = \delta_{x_i}$, then we
are back at the classical interpolation condition. In the type of applications we
are most interested in, $\{C_1, \dots, C_n\}$ denotes a set of pairwise disjoint control
volumes and $I$ is the information about cell averages $\langle \lambda_i, \Phi \rangle := A(C_i)\Phi$ of
$\Phi \in V$. In §4.4, the functions $\varphi_1, \dots, \varphi_n$ were either of the type
$(x - x_0)^i (y - y_0)^j$ for (4.5) or $\Lambda_1^i \Lambda_2^j$ for (4.7). The Van der Monde condition has already
been introduced in (4.3). However, the results we present here can be applied
to more general problems.

The simple rules on linear systems enable us to get the following

Lemma 8. The following three statements are equivalent.

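1. $\varphi_1, \dots, \varphi_n$ constitute an I-Tschebyscheff system on $\Omega$.

2. For every function $\Phi : \Omega \to \mathbb{R}$ and $I = (\lambda_1, \dots, \lambda_n)^T$, there exists a
unique linear combination

$$p_n\Phi := p\Phi\begin{bmatrix} \varphi_1, \dots, \varphi_n \\ \lambda_1, \dots, \lambda_n \end{bmatrix} := \sum_{i=1}^{n} a_i \varphi_i$$

of $\varphi_1, \dots, \varphi_n$ satisfying the recovery conditions

$$\langle \lambda_i, p_n\Phi \rangle = \langle \lambda_i, \Phi \rangle, \quad i = 1, \dots, n. \qquad (5.1)$$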

Note  that  the  conditions  (5.1)  are,  in  the  case  of  cell  averages,  the  conditions 
(4.2).  We  are  now  ready  to  define  the  generalized  divided  differences. 

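Definition 9. The coefficients $a_i$ in the representation

$$p_n\Phi = \sum_{i=1}^{n} a_i \varphi_i,$$

i.e. the coefficients corresponding to the $\varphi_i$'s, are called generalized divided
differences of $\Phi$ with respect to the I-Tschebyscheff system $\varphi_1, \dots, \varphi_n$ and
will be denoted by

$$\left[\begin{array}{c} \varphi_1, \dots, \varphi_n \\ \lambda_1, \dots, \lambda_n \end{array} \,\Big|\, \Phi \right]_i .$$

The function

$$r_n\Phi := r\Phi\begin{bmatrix} \varphi_1, \dots, \varphi_n \\ \lambda_1, \dots, \lambda_n \end{bmatrix} := \Phi - p\Phi\begin{bmatrix} \varphi_1, \dots, \varphi_n \\ \lambda_1, \dots, \lambda_n \end{bmatrix}$$

is called the recovery error function.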


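Lemma 10. With the notation of Definition 7 the representation

$$\left[\begin{array}{c} \varphi_1, \dots, \varphi_n \\ \lambda_1, \dots, \lambda_n \end{array} \,\Big|\, \Phi \right]_k
= \frac{V\begin{pmatrix} \varphi_1, \dots, \varphi_{k-1}, \Phi, \varphi_{k+1}, \dots, \varphi_n \\ \lambda_1, \dots, \lambda_{k-1}, \lambda_k, \lambda_{k+1}, \dots, \lambda_n \end{pmatrix}}
{V\begin{pmatrix} \varphi_1, \dots, \varphi_n \\ \lambda_1, \dots, \lambda_n \end{pmatrix}}$$

for all $k = 1, \dots, n$ as well as

$$r_n\Phi(x) = \frac{V\begin{pmatrix} \varphi_1, \dots, \varphi_n, \Phi \\ \lambda_1, \dots, \lambda_n, \delta_x \end{pmatrix}}
{V\begin{pmatrix} \varphi_1, \dots, \varphi_n \\ \lambda_1, \dots, \lambda_n \end{pmatrix}}$$

hold.

We now generalize Mühlbach's Newton-type interpolation formula constructed
in [31] to the case of the recovery problem.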

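Theorem 11. Let $m \le n$ be natural numbers. Suppose that $\varphi_1, \dots, \varphi_n$ constitute
an I-Tschebyscheff system with $I = (\lambda_1, \dots, \lambda_n)^T$ such that its subsystem
$\varphi_1, \dots, \varphi_m$ is again an I-Tschebyscheff system with respect to $\lambda_1, \dots, \lambda_m$.
Then, for every function $\Phi \in V$ and every $x \in \Omega$ there holds the Mühlbach
expansion

$$p\Phi\begin{bmatrix} \varphi_1, \dots, \varphi_n \\ \lambda_1, \dots, \lambda_n \end{bmatrix}(x)
= p\Phi\begin{bmatrix} \varphi_1, \dots, \varphi_m \\ \lambda_1, \dots, \lambda_m \end{bmatrix}(x)
+ \sum_{k=m+1}^{n} \left[\begin{array}{c} \varphi_1, \dots, \varphi_n \\ \lambda_1, \dots, \lambda_n \end{array} \,\Big|\, \Phi \right]_k
\cdot r\varphi_k\begin{bmatrix} \varphi_1, \dots, \varphi_m \\ \lambda_1, \dots, \lambda_m \end{bmatrix}(x).$$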

It would be desirable for numerical purposes to compute the generalized
divided differences not by means of the clumsy determinant formula given in
Lemma 10 but only from previously calculated divided differences, as in the
Newton polynomials in $\mathbb{R}$.


5.1  A  linear  system  for  divided  differences 


Before presenting Mühlbach's recurrence relation for generalized divided
differences in the recovery case we present a related result which already allows
the computation of the divided differences as solutions of linear systems of
equations.

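Theorem 12. For any $\Phi : \Omega \to \mathbb{R}$, $\Phi \in V$, the generalized divided differences

$$\alpha_k = \left[\begin{array}{c} \varphi_1, \dots, \varphi_n \\ \lambda_1, \dots, \lambda_n \end{array} \,\Big|\, \Phi \right]_k, \quad k = m+1, \dots, n,$$

are uniquely determined as solution of the system of $n - m$ linear equations

$$\sum_{k=m+1}^{n} \alpha_k \cdot \Big\langle \lambda_i,\ r\varphi_k\begin{bmatrix} \varphi_1, \dots, \varphi_m \\ \lambda_1, \dots, \lambda_m \end{bmatrix} \Big\rangle
= \Big\langle \lambda_i,\ r\Phi\begin{bmatrix} \varphi_1, \dots, \varphi_m \\ \lambda_1, \dots, \lambda_m \end{bmatrix} \Big\rangle$$

for $i = m+1, \dots, n$.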


5.2  A  recurrence  relation  for  divided  differences 

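The method of computing the generalized divided differences described in
Theorem 12 does not use previously computed divided differences exclusively
but requires the computation of recovery error terms. By transferring
Mühlbach's generalized recurrence relation for divided differences in interpolation
problems to the recovery case we finish the description of Mühlbach
expansions.

Theorem 13. Let $m \le n$ be integers and suppose that $\varphi_1, \dots, \varphi_n$ constitute
an I-Tschebyscheff system, $I = (\lambda_1, \dots, \lambda_n)^T$, such that its subsystem
$\varphi_1, \dots, \varphi_m$ is also an I-Tschebyscheff system with regard to $\lambda_1, \dots, \lambda_m$. For
$\Phi \in V$ let $\alpha$ denote the vector

$$\alpha = \begin{pmatrix} \alpha_{m+1} \\ \vdots \\ \alpha_n \end{pmatrix}
= \begin{pmatrix}
\left[\begin{array}{c} \varphi_1, \dots, \varphi_n \\ \lambda_1, \dots, \lambda_n \end{array} \,\Big|\, \Phi \right]_{m+1} \\
\vdots \\
\left[\begin{array}{c} \varphi_1, \dots, \varphi_n \\ \lambda_1, \dots, \lambda_n \end{array} \,\Big|\, \Phi \right]_{n}
\end{pmatrix} \in \mathbb{R}^{n-m}$$

of generalized divided differences. Then $\alpha$ is uniquely determined as solution
of the linear system

$$C \alpha = u,$$

where $C \in \mathbb{R}^{m(n-m) \times (n-m)}$. If $C = (c_1, \dots, c_{n-m})$ is the representation of $C$
in terms of the column vectors $c_k$, then the $k$-th column vector is given by

$$c_k = \begin{pmatrix}
\left[\begin{array}{c} \varphi_1, \dots, \varphi_m \\ \lambda_2, \dots, \lambda_{m+1} \end{array} \,\Big|\, \varphi_{m+k} \right]_1
- \left[\begin{array}{c} \varphi_1, \dots, \varphi_m \\ \lambda_1, \dots, \lambda_m \end{array} \,\Big|\, \varphi_{m+k} \right]_1 \\
\vdots \\
\left[\begin{array}{c} \varphi_1, \dots, \varphi_m \\ \lambda_{n-m+1}, \dots, \lambda_n \end{array} \,\Big|\, \varphi_{m+k} \right]_1
- \left[\begin{array}{c} \varphi_1, \dots, \varphi_m \\ \lambda_{n-m}, \dots, \lambda_{n-1} \end{array} \,\Big|\, \varphi_{m+k} \right]_1 \\
\vdots \\
\left[\begin{array}{c} \varphi_1, \dots, \varphi_m \\ \lambda_2, \dots, \lambda_{m+1} \end{array} \,\Big|\, \varphi_{m+k} \right]_m
- \left[\begin{array}{c} \varphi_1, \dots, \varphi_m \\ \lambda_1, \dots, \lambda_m \end{array} \,\Big|\, \varphi_{m+k} \right]_m \\
\vdots \\
\left[\begin{array}{c} \varphi_1, \dots, \varphi_m \\ \lambda_{n-m+1}, \dots, \lambda_n \end{array} \,\Big|\, \varphi_{m+k} \right]_m
- \left[\begin{array}{c} \varphi_1, \dots, \varphi_m \\ \lambda_{n-m}, \dots, \lambda_{n-1} \end{array} \,\Big|\, \varphi_{m+k} \right]_m
\end{pmatrix},$$

while the right hand side $u \in \mathbb{R}^{m(n-m)}$ is given by

$$u = \begin{pmatrix}
\left[\begin{array}{c} \varphi_1, \dots, \varphi_m \\ \lambda_2, \dots, \lambda_{m+1} \end{array} \,\Big|\, \Phi \right]_1
- \left[\begin{array}{c} \varphi_1, \dots, \varphi_m \\ \lambda_1, \dots, \lambda_m \end{array} \,\Big|\, \Phi \right]_1 \\
\vdots \\
\left[\begin{array}{c} \varphi_1, \dots, \varphi_m \\ \lambda_{n-m+1}, \dots, \lambda_n \end{array} \,\Big|\, \Phi \right]_1
- \left[\begin{array}{c} \varphi_1, \dots, \varphi_m \\ \lambda_{n-m}, \dots, \lambda_{n-1} \end{array} \,\Big|\, \Phi \right]_1 \\
\vdots \\
\left[\begin{array}{c} \varphi_1, \dots, \varphi_m \\ \lambda_2, \dots, \lambda_{m+1} \end{array} \,\Big|\, \Phi \right]_m
- \left[\begin{array}{c} \varphi_1, \dots, \varphi_m \\ \lambda_1, \dots, \lambda_m \end{array} \,\Big|\, \Phi \right]_m \\
\vdots \\
\left[\begin{array}{c} \varphi_1, \dots, \varphi_m \\ \lambda_{n-m+1}, \dots, \lambda_n \end{array} \,\Big|\, \Phi \right]_m
- \left[\begin{array}{c} \varphi_1, \dots, \varphi_m \\ \lambda_{n-m}, \dots, \lambda_{n-1} \end{array} \,\Big|\, \Phi \right]_m
\end{pmatrix}.$$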


Remi  Abgrall  et  al. 


5.3  Some  examples 

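An interpolatory example. We explain the notions introduced above by
means of a simple example corresponding to $I\Phi = (\Phi(x_1), \Phi(x_2), \Phi(x_3))^T$,
with $x_1, x_2, x_3$ forming a non-degenerate triangle in $\mathbb{R}^2$. The Mühlbach expansion
is sought with respect to the system $\varphi_1, \varphi_2, \varphi_3$ defined by
$\varphi_i(x) := a^i_{00} + a^i_{10} x_1 + a^i_{01} x_2$, $\varphi_i(x_k) = \delta_i^k$, $i, k = 1, 2, 3$. Thus, we consider the linear finite
element functions on the triangle spanned by $x_1, x_2, x_3$ interpolating $\Phi$ at
these points, i.e. the three linear functions taking the value 1 on one node of
the triangle while vanishing on the remaining two nodes. We would expect
the interpolant to be the function $\Phi(x_1)\varphi_1(x) + \Phi(x_2)\varphi_2(x) + \Phi(x_3)\varphi_3(x)$.

The Mühlbach expansion can be written in the form

$$p\Phi\begin{bmatrix} \varphi_1, \varphi_2, \varphi_3 \\ \delta_{x_1}, \delta_{x_2}, \delta_{x_3} \end{bmatrix}(x)
= p\Phi\begin{bmatrix} \varphi_1 \\ \delta_{x_1} \end{bmatrix}(x)
+ \left[\begin{array}{c} \varphi_1, \varphi_2, \varphi_3 \\ \delta_{x_1}, \delta_{x_2}, \delta_{x_3} \end{array} \,\Big|\, \Phi \right]_2
\cdot r\varphi_2\begin{bmatrix} \varphi_1 \\ \delta_{x_1} \end{bmatrix}(x)
+ \left[\begin{array}{c} \varphi_1, \varphi_2, \varphi_3 \\ \delta_{x_1}, \delta_{x_2}, \delta_{x_3} \end{array} \,\Big|\, \Phi \right]_3
\cdot r\varphi_3\begin{bmatrix} \varphi_1 \\ \delta_{x_1} \end{bmatrix}(x).$$

According to Lemma 8 the function $p\Phi\begin{bmatrix} \varphi_1 \\ \delta_{x_1} \end{bmatrix} = p_1\Phi$ satisfies the recovery
condition (i.e. interpolation condition) $p_1\Phi(x_1) = \Phi(x_1)$ and can be written
in the form $p_1\Phi(x) = a_1\varphi_1(x)$. Thus, $p_1\Phi(x_1) = \Phi(x_1) = a_1\varphi_1(x_1) = a_1$
and therefore $p_1\Phi(x) = \Phi(x_1)\varphi_1(x)$. According to Lemma 10 we furthermore
have

$$\left[\begin{array}{c} \varphi_1, \varphi_2, \varphi_3 \\ \delta_{x_1}, \delta_{x_2}, \delta_{x_3} \end{array} \,\Big|\, \Phi \right]_2 = \Phi(x_2).$$

Analogously,

$$\left[\begin{array}{c} \varphi_1, \varphi_2, \varphi_3 \\ \delta_{x_1}, \delta_{x_2}, \delta_{x_3} \end{array} \,\Big|\, \Phi \right]_3 = \Phi(x_3).$$

Also according to Lemma 10 we have $r\varphi_2\begin{bmatrix} \varphi_1 \\ \delta_{x_1} \end{bmatrix}(x) = \varphi_2(x)$ and
$r\varphi_3\begin{bmatrix} \varphi_1 \\ \delta_{x_1} \end{bmatrix}(x) = \varphi_3(x)$. Therefore, the Mühlbach expansion results in

$$p_3\Phi(x) = \sum_{i=1}^{3} \Phi(x_i)\varphi_i(x),$$

which indeed is the required interpolant.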


An efficient algorithm for quadratic polynomial recovery. We show
that the problem of recovering a quadratic polynomial from cell averages can
be broken up into two 3 x 3-systems instead of solving one 6 x 6-system of
equations. A quadratic polynomial is sought on the triangulation shown in
Figure 13. With each of the nodes $x_i$, $i = 1, \dots, 6$, we associate the linear
functional $\lambda_i = \langle A(C_i), \cdot \rangle$. As already explained, a direct computation of
a recovery polynomial $p(x) := \sum_{|\alpha| \le 2} a_\alpha x^\alpha$ satisfying the recovery conditions
$\langle A(C_i), p \rangle = \langle A(C_i), \Phi \rangle$ for $i = 1, \dots, 6$, would require the solution
of a 6 x 6-system for the unknown coefficients $a_\alpha$. The determinant of the
coefficient matrix of this system is of generalized Van der Monde type, so
high condition numbers arise and cause numerical problems during the
solution process. In any case, it would be desirable to break down the computation
into smaller subproblems. We now show that Mühlbach expansions can
accomplish this task. We assume that the triangulation is chosen such that
polynomials of degree not exceeding two form an I-Tschebyscheff system.
Note that this can always be assured in practice.

On the triangle $T_{min}$ shown in Figure 13 we compute three linear finite
element functions $\Lambda_1, \Lambda_2, \Lambda_3$ according to the recovery conditions
$A(C_i)\Lambda_j = \delta_i^j$, $i, j \in \{1, 2, 3\}$. According to our notation these three functions comprise
the recovery function

$$p_3\Phi = p\Phi\begin{bmatrix} \Lambda_1, \Lambda_2, \Lambda_3 \\ A(C_1), A(C_2), A(C_3) \end{bmatrix}$$

which can be thought of as the linear part in a Mühlbach expansion. Defining
the remaining functions $\Lambda_4(x) = \Lambda_1\Lambda_2$, $\Lambda_5(x) = \Lambda_1^2$, $\Lambda_6(x) = \Lambda_2^2$, the
complete Mühlbach expansion is then given as

$$p\Phi\begin{bmatrix} \Lambda_1, \dots, \Lambda_6 \\ A(C_1), \dots, A(C_6) \end{bmatrix}(x)
= p\Phi\begin{bmatrix} \Lambda_1, \Lambda_2, \Lambda_3 \\ A(C_1), A(C_2), A(C_3) \end{bmatrix}(x)
+ \sum_{k=4}^{6} \alpha_k \cdot r\Lambda_k\begin{bmatrix} \Lambda_1, \Lambda_2, \Lambda_3 \\ A(C_1), A(C_2), A(C_3) \end{bmatrix}(x),$$

where the $\alpha_k$'s denote the generalized divided differences again. Due to the
recovery properties of the linear functions $\Lambda_i$, $i = 1, 2, 3$, it follows that

$$V\begin{pmatrix} \Lambda_1, \Lambda_2, \Lambda_3 \\ A(C_1), A(C_2), A(C_3) \end{pmatrix} = 1.$$

Thus, according to Lemma 10 we obtain

$$r_3\Lambda_k(x) = \Lambda_k(x) - \sum_{l=1}^{3} \Lambda_l(x)\, \langle A(C_l), \Lambda_k \rangle,$$

for $k \in \{4, 5, 6\}$, and $r_3\Phi = \Phi(x) - \sum_{l=1}^{3} \Lambda_l(x)\, \langle A(C_l), \Phi \rangle$. Following
Theorem 12 the divided differences can be computed as solutions of a 3 x 3-system.
In our case it is easy to verify that the system has the form $A\alpha = u$ with
$A = (a_{ij})_{1 \le i,j \le 3}$ given by

$$a_{ij} = A(C_{i+3})\Lambda_{j+3} - \sum_{l=1}^{3} A(C_{i+3})\Lambda_l \cdot \langle A(C_l), \Lambda_{j+3} \rangle,$$

the right hand side $u = (b_1, b_2, b_3)^T$ is

$$b_i = A(C_{i+3})\Phi - \sum_{l=1}^{3} A(C_{i+3})\Lambda_l \cdot \langle A(C_l), \Phi \rangle,$$

and $\alpha = (\alpha_4, \alpha_5, \alpha_6)^T$. Thus, the process of recovery can be conveniently
broken down into smaller subproblems by using Mühlbach expansions.
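
The two-stage procedure can be sketched in code. The following is a simplified illustration under explicit assumptions: the linear functionals are point evaluations at six hypothetical sample points rather than cell averages over control volumes, and all names (`points`, `linear_basis`, `recover_quadratic`) are invented for the example. The first 3 x 3 solve yields $\Lambda_1, \Lambda_2, \Lambda_3$; the second yields the three remaining divided differences.

```python
import numpy as np

# Six sample points standing in for the six control volumes (hypothetical).
# The first three play the role of the triangle T_min.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                   [1.0, 1.0], [2.0, 0.0], [0.0, 2.0]])

def linear_basis(tri):
    """First 3 x 3 system: linear functions with Lambda_j(tri[i]) = delta_ij."""
    m = np.column_stack([np.ones(3), tri[:, 0], tri[:, 1]])
    c = np.linalg.solve(m, np.eye(3))  # column j holds the coefficients of Lambda_j
    return [lambda x, y, j=j: c[0, j] + c[1, j] * x + c[2, j] * y for j in range(3)]

def recover_quadratic(phi, pts):
    lam = linear_basis(pts[:3])
    lam.append(lambda x, y: lam[0](x, y) * lam[1](x, y))  # Lambda_4
    lam.append(lambda x, y: lam[0](x, y) ** 2)            # Lambda_5
    lam.append(lambda x, y: lam[1](x, y) ** 2)            # Lambda_6

    def val(f, i):
        """Apply the i-th linear functional (here: evaluation at pts[i])."""
        return f(pts[i, 0], pts[i, 1])

    # Second 3 x 3 system A alpha = u for the divided differences.
    a = np.empty((3, 3))
    b = np.empty(3)
    for i in range(3):
        for j in range(3):
            a[i, j] = val(lam[j + 3], i + 3) - sum(
                val(lam[l], i + 3) * val(lam[j + 3], l) for l in range(3))
        b[i] = val(phi, i + 3) - sum(
            val(lam[l], i + 3) * val(phi, l) for l in range(3))
    alpha = np.linalg.solve(a, b)

    def p(x, y):
        # Linear part of the expansion plus the weighted recovery errors.
        result = sum(val(phi, l) * lam[l](x, y) for l in range(3))
        for k in range(3):
            r_k = lam[k + 3](x, y) - sum(
                lam[l](x, y) * val(lam[k + 3], l) for l in range(3))
            result += alpha[k] * r_k
        return result

    return p

phi = lambda x, y: 1.0 + 2.0 * x - y + 3.0 * x * y - x ** 2 + 0.5 * y ** 2
p = recover_quadratic(phi, points)
print(p(0.3, 0.7), phi(0.3, 0.7))
```

Because the test function is itself a quadratic and the six points are unisolvent for quadratics, the recovered polynomial should coincide with it at any point, which gives a simple consistency check of the two-stage solve.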

Remark 14. In [3], the computation of the polynomial expansion $p\Phi$ was carried
out by using the error $r_3\Phi$, as here. Instead of introducing the error
functions $r_3\Lambda_4$, $r_3\Lambda_5$ and $r_3\Lambda_6$, the error was expanded in terms of the
$\Lambda_i$'s, $i = 1, \dots, 6$. Then the 6 x 6-system is reduced to a 3 x 3 one. It turns out
that the method of [3] exactly reduces to the one presented here for degree 2.
For higher degree, say degree $r$, the method of [3] needs the solution of two
linear systems. One of them is a $(r+1) \times (r+1)$ system, the other one is
a $(r+1)(r+2) \times (r+1)(r+2)$ system. It is clear that the present method is much
more efficient in general.

6  The  ENO  reconstruction 

6.1  ENO  on  general  meshes 

In [1], we have found that only a few stencils were indeed necessary to achieve
an essentially non-oscillatory reconstruction of a piecewise smooth function
on a triangular mesh. This set has to be as isotropic as possible. Moreover, the
ENO reconstruction was found to achieve the expected order of accuracy for
smooth functions, even on very irregular meshes. In what follows, $a_{ij}$ always
stands for any of the coefficients of the reconstruction $P$ in the natural basis,
$\{(x - x_0)^i (y - y_0)^j\}$.

Then we can apply the procedure of [1] in a straightforward manner. Let
us describe our procedure for reconstruction up to third order:

(i) We start from a given cell, $C_0$, assigned to a point of $M$, say $(x_0, y_0)$;

(ii) Consider all the triangles having $(x_0, y_0)$ as a vertex, and choose the one,
say $T_{min}$, that minimizes $\sum_{i+j=1} |a_{ij}|$. Here, $S_1$ is the set of control volumes
located around the vertices of $T_{min}$ (see Figure 13). For a regular unstructured
mesh, there are six possible triangles;

(iii) Consider $T_{min}$. For each of its edges, consider the three triangles
$T_1$, $T_2$, $T_3$ as in Figure 13-a. We choose the configuration that minimizes the sum

$$\sum_{i+j=2} |a_{ij}|.$$

What can be done for fourth (and higher) order reconstruction is explained
in [1].
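
Step (ii) can be sketched as follows. This is a simplified illustration only: it assumes point values at three sample points per candidate instead of cell averages over control volumes, and the function names and test data are hypothetical. For each candidate it solves for the coefficients of the linear polynomial in the natural basis and keeps the candidate with the smallest $\sum_{i+j=1} |a_{ij}|$.

```python
import numpy as np

def linear_coefficients(points, values, origin):
    """Solve for (a00, a10, a01) of a00 + a10 (x - x0) + a01 (y - y0)."""
    d = np.asarray(points, dtype=float) - np.asarray(origin, dtype=float)
    m = np.column_stack([np.ones(len(d)), d[:, 0], d[:, 1]])
    return np.linalg.solve(m, np.asarray(values, dtype=float))

def select_stencil(candidates, u, origin):
    """Return the candidate (three points) minimizing |a10| + |a01|."""
    def leading_sum(stencil):
        values = [u(x, y) for x, y in stencil]
        coeffs = linear_coefficients(stencil, values, origin)
        return abs(coeffs[1]) + abs(coeffs[2])
    return min(candidates, key=leading_sum)

# Hypothetical data: a gentle slope plus a jump across the line x = 0.5.
def u(x, y):
    return 0.1 * x + (1.0 if x > 0.5 else 0.0)

left = [(0.0, 0.0), (-0.4, 0.0), (0.0, 0.4)]      # stays on the smooth side
crossing = [(0.0, 0.0), (0.6, 0.0), (0.0, 0.4)]   # reaches across the jump
best = select_stencil([left, crossing], u, origin=(0.0, 0.0))
print(best)
```

The candidate that crosses the jump produces a much larger first-order coefficient, so the selection falls on the stencil lying entirely in the smooth region.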


6.2  Numerical  examples 

We have performed several tests on the second, third and fourth order ENO
interpolation and ENO reconstruction, but we only report the third order
results since they are a priori more computationally interesting. In particular,
we intend to check numerically that the expected order of accuracy is in fact
reached for smooth functions.

In all these examples, we have assumed that the control volumes are
elements of the dual mesh. The practical calculations of the averages in these
control volumes have been performed with a fifth-order quadrature formula
([37], Table 4.1, p. 184).

The tests on smooth functions are performed on:

$$u(x, y) = \cos(2\pi(x^2 + y^2)). \qquad (6.1)$$


All the error estimates have been obtained on irregular meshes such as the one
presented in Figure 13. The main difference between such a mesh and the
regular structured one is that the number of triangles each node belongs to
is different. We have also done the same tests with regular meshes, and we
have not seen any degradation of the convergence.

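The locally smooth function we have chosen is obtained by a modification
of the one used by Harten in [15], for example: if $\phi$ is any angle, let a function
$U_\phi$ be defined by:

$$U_\phi(x,y) = \begin{cases}
-r \sin(1.5\pi r^2), & r \le -\tfrac{1}{3}, \\
2r - 1 + \tfrac{1}{6} \sin(3\pi r), & r \ge \tfrac{1}{3}, \\
|\sin(2\pi r)|, & |r| < \tfrac{1}{3},
\end{cases} \qquad (6.2)$$

where $r = x - \tan(\phi)\, y$. From $U_\phi$ we finally define $u$ to be:

$$u(x,y) = \begin{cases}
U_{\sqrt{\pi/2}}(x,y), & x \le \tfrac{1}{2} \cos \pi y, \\
U_{-\sqrt{\pi/2}}(x,y) + \cos(2\pi y), & x > \tfrac{1}{2} \cos \pi y.
\end{cases} \qquad (6.3)$$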


The  function  defined  by  (6.2)-(6.3)  has  discontinuities  in  the  function  itself 
and  its  first  order  derivatives;  some  of  the  discontinuities  are  straight  lines 
(never  aligned  to  the  mesh),  one  is  a  curved  line  where  the  jump  changes  from 
one  point  to  another.  Last,  the  behavior  of  u  is  basically  one-dimensional  on 
the  left  of  the  curve  x  =  cos7ry/2  and  really  two-dimensional  on  the  right. 

A  plot  of  this  function  is  given  in  Figure  13-(B).  One  should  obtain 
straight  lines  and  smooth  transitions  at  discontinuities  contrary  to  what  is 
shown  in  the  Figure:  this  is  an  effect  of  the  plotting  procedure  in  which  linear 
finite  element  hat  functions  are  used  for  interpolation  purposes. 


Results on the smooth function. We have displayed in Table 13.1 the
$L^\infty$-error of the third-order ENO reconstruction. The experiments have been
done in two different contexts. The column "(a)" of Table 13.1 corresponds
to different meshes that have been generated independently. In this case, the
constant $C$ of Theorem 1 is different for each mesh, so that the slope $-3$ has
to be expected in the mean only. The column "(b)" of Table 13.1 corresponds
to meshes that have been successively refined: the same constant $C$ appears,
and the slope $-3$ is recovered much better. Here $h$ is the maximal radius of
the circumscribed circles of the triangles, and $r_c$ is a number such that the error
is proportional to $h^{r_c}$.
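
An exponent such as $r_c$ can be estimated from a sequence of mesh sizes and errors by a least-squares fit in log-log coordinates. The sketch below is illustrative only: the helper name `observed_order` is hypothetical and the error values are synthetic (generated as $2h^3$), not results from Table 13.1.

```python
import numpy as np

def observed_order(h, err):
    """Least-squares slope of log(err) against log(h)."""
    slope, _intercept = np.polyfit(np.log(h), np.log(err), 1)
    return slope

# Synthetic data for illustration: errors behaving exactly like 2 h^3.
h = np.array([0.1, 0.05, 0.025, 0.0125])
err = 2.0 * h ** 3
rate = observed_order(h, err)
print("estimated exponent:", rate)
```

With independently generated meshes the points scatter around the fitted line, which is why a fit over all meshes is more informative than the ratio of two consecutive errors.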


Results on the nonsmooth function. In Figure 13-(A), we have displayed
the nodal values of the third-order ENO reconstruction for the mesh shown
in Figure 13. There are no oscillations in the reconstruction. In [1], the same
representation is given for the fourth order reconstruction, and the only visible
difference is a better resolution of the area surrounding the triple points.

Where  the  function  is  smooth,  we  should  recover  the  asymptotic  order  of 
convergence  obtained  for  a  smooth  function.  In  order  to  check  this,  we  have 
computed  the  error  between  the  reconstruction  and  the  exact  function  u  at 
different  points  of  the  line  y  =  0,  namely  at  x  =  0.4,  0.2,  0.1,  0,  —0.2,  —0.5, 
—0.75.  The  results  are  presented  in  Table  13  for  third-order  accuracy.  We  get 
what  is  expected.  In  particular,  the  point  x  =  0  is  on  the  line  where  u  is 
continuous  but  its  first-order  derivative  has  a  jump,  so  that  only  a  first-order 
approximation  is  obtained  in  any  case.  Elsewhere,  a  third  order  accuracy  is 
recovered. 

In  [1],  another  selection  procedure  has  also  been  proposed.  It  includes  a 
much  richer  set  of  stencils  but  no  real  improvement  has  been  noticed.  From 
all  our  experiments,  we  can  conclude  that  the  choice  discussed  here  is  indeed 
sufficient. 

7  Weighted  ENO  reconstruction 

Although the results for ENO reconstruction are reasonable, critical investigations
show two weak points of the approach:

1. In smooth regions the accuracy is not as good as for TVD schemes.

2. Convergence for steady state flows is usually not achieved.

In this section we address these problems and show how they can be
circumvented.


7.1  Motivation  of  WENO  reconstruction 

The main idea of ENO schemes is to compute several candidates $P_i$ for a
reconstruction $P$ and to choose the one with the lowest oscillation. If we assume
that we are in a smooth region of the flow then none of the $P_i$ really
oscillates. In this context choosing the candidate $P_i$ with the lowest oscillation
means choosing the flattest reconstruction, i.e. maximizing dissipation.
This explains why the accuracy is not that good in smooth regions.

Furthermore, if there are enough candidates $P_i$ then
there will be more than one candidate with a comparably low oscillation.
Thus, even small changes of the data will force a switch from one candidate
to another. This digital switching prevents convergence of the scheme for
steady state flows.

Both drawbacks can be removed or at least reduced by modifying the
ENO scheme in the following way: instead of digitally selecting the least
oscillating reconstruction we use a weighted sum:

$$P := \sum_i \omega_i P_i.$$

The positive weights $\omega_i$ with $\sum_i \omega_i = 1$ are chosen such that $\omega_i$ is small
if the oscillation of $P_i$ is high and $\omega_i$ is larger for less oscillating $P_i$. This
scheme is then called the weighted ENO scheme (WENO). It was introduced for
the one-dimensional case in [40,24] and applied to the case of unstructured
grids in two dimensions in [21].


7.2  Choice  of  weights 

For the computation of weights, it has to be clarified how the oscillation of $P$
is measured. From Theorem 3 one comes to the conclusion that $\sum_{i+j=n} |a_{ij}|$
should be used. However, numerical tests (see [21]) have demonstrated that
this oscillation measure is not well suited as a basis for the weights. Much
better results are obtained using the following quantity:

$$\mathrm{osc}(P) := \Big( \sum_{1 \le i+j \le n} \int_C h^{2(i+j)-4}\, \Big\| \frac{\partial^{i+j} P}{\partial x^i\, \partial y^j}(x) \Big\|^2 dx \Big)^{1/2}$$

where $C$ is the cell $P$ has to be computed for and $h = \sqrt{|C|}$.

The oscillation measures $\mathrm{osc}(P_i)$ are then used to compute the weights as
follows:

$$\Omega_i := (\epsilon + \mathrm{osc}(P_i))^{-r},$$

where $r$ is positive, and

$$\omega_i := \frac{\Omega_i}{\sum_k \Omega_k}.$$

Note that if there is exactly one $P_i$ with minimal oscillation then the
WENO scheme tends to the classical ENO scheme as $r$ tends to infinity. On
the other hand, if $r$ tends to zero then the oscillation will not be taken into
account for the weights, which means that the scheme will become an unstable
central scheme.
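
The weight computation and its two limits can be sketched as follows. This is a minimal illustration assuming NumPy; the function name `weno_weights` and the default value of `eps` are assumptions made for the example (the text does not specify $\epsilon$), and the oscillation values are invented.

```python
import numpy as np

def weno_weights(osc, r=4.0, eps=1e-6):
    """Normalized weights omega_i from oscillation measures osc(P_i).

    eps is an assumed small positive constant guarding against division
    by zero; r is the positive exponent from the text.
    """
    osc = np.asarray(osc, dtype=float)
    raw = (eps + osc) ** (-r)
    return raw / raw.sum()

osc_values = [0.1, 1.0, 10.0]                  # invented oscillation measures
w = weno_weights(osc_values)
w_flat = weno_weights(osc_values, r=0.0)       # oscillation ignored
w_sharp = weno_weights(osc_values, r=50.0)     # close to a pure ENO choice
print(w, w_flat, w_sharp)
```

For `r = 0` all candidates receive the same weight, while a large `r` concentrates almost all the weight on the least oscillating candidate, which mirrors the two limits described above.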

Numerical tests showed that $r = 4$ is large enough to keep the scheme
stable even for flows with strong discontinuities and small enough to obtain
a significant improvement over the ENO scheme for smooth flows.

7.3  Required modifications of the reconstruction algorithm

It would be a violation of the main idea to use the hierarchical recovery
algorithm described in §5 for third order reconstruction, which leads to
only three stencils for polynomial degree two. For WENO reconstruction
the linear part of the Mühlbach expansion cannot be fixed. Instead, the
stencils for all the triangles, and not only one, have to be taken into account.
For each of the triangles Mühlbach expansions are used.

7.4  A  stencil  selection  algorithm  that  does  not  need  triangles 

In the last sections we have described the ENO and WENO reconstruction
algorithms using a triangulation to select stencils. As stated before, this
is no restriction in principle on the kind of control volumes used, because
a triangulation of the control volumes' centers can always be constructed to
obtain the required topological information.

However, this technique may be impractical, and one may wish to select
stencils for a finite volume grid without the need for a triangulation.

Grids like the dual mesh of a triangulation, and also grids obtained from
a dual mesh by fusing cells together, have a nice topological property: if
the boundaries of two cells have a common point, then they already have a
common edge. This property does not hold for primary triangular grids, nor
for quadrilateral grids, where cells can touch at single points.

In the following we call cell b a neighbor of cell a if their boundaries
have a common edge. We say cell b is touching cell a if their boundaries
have a common point. With this definition, the topological property described
above means that for this kind of grid touching cells are already neighbors.

For this kind of grid the stencil selection algorithm described in [21] is
used:

Polynomial degree 1: For polynomial degree n = 1 the required stencil size
is three. We select as stencils for cell $C_i$ all sets of three cells
$\{C_i, C_a, C_b\}$ which have the following properties:

- $C_a$ is a neighbor of $C_i$, and

- $C_b$ is a neighbor of $C_i$ and of $C_a$.

Polynomial degree 2: For polynomial degree n = 2 the required stencil size
is six. We select as stencils for cell $C_i$ all sets of six cells
$\{C_i, C_a, C_b, C_c, C_d, C_e\}$ which have the following properties:
First, $\{C_i, C_a, C_b\}$ has to be a selected stencil for polynomial degree
n = 1. Second, $C_c$, $C_d$ and $C_e$ have to fulfill one of the following
three conditions (see Figure 13 for an example of each type; $C_i$ is dark
shaded):

1. Central stencil:

- $C_c$ is a neighbor of $C_i$ and of $C_a$, and

- $C_d$ is a neighbor of $C_i$ and of $C_b$, and

- $C_e$ is a neighbor of $C_i$ and of either $C_c$ or $C_d$.

2. Almost central stencil:

- $C_c$ is a neighbor of $C_i$ and of $C_a$, and

- $C_d$ is a neighbor of $C_i$ and of $C_b$, and

- $C_e$ is a neighbor of $C_a$ and of $C_b$.

3. One-sided stencil:

- $C_c$ is a neighbor of $C_a$ and of $C_b$, and

- $C_d$ is a neighbor of $C_a$ and of $C_c$, and

- $C_e$ is a neighbor of $C_b$ and of $C_c$.

Note that, apart from the central stencils, this results in the same stencils
as those described in §5.3.
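
The selection rules above can be sketched as follows, assuming the grid is
given as a dictionary `nbrs` that maps each cell index to the set of its
edge neighbors. The data layout, the function names and the small test grid
(one cell surrounded by a ring of six) are hypothetical illustrations and
are not taken from [21].

```python
from itertools import permutations


def degree1_stencils(nbrs, i):
    """Stencils {C_i, C_a, C_b}: C_a neighbors C_i, C_b neighbors C_i and C_a."""
    return {frozenset((i, a, b)) for a in nbrs[i] for b in nbrs[i] & nbrs[a]}


def degree2_stencils(nbrs, i):
    """Six-cell stencils of the three types, built on the degree-1 stencils."""
    found = {"central": set(), "almost central": set(), "one-sided": set()}

    def keep(kind, cells):
        if len(set(cells)) == 6:              # all six cells must be distinct
            found[kind].add(frozenset(cells))

    for base in degree1_stencils(nbrs, i):
        for a, b in permutations(base - {i}, 2):    # both roles of C_a, C_b
            for c in nbrs[i] & nbrs[a]:
                for d in nbrs[i] & nbrs[b]:
                    # central: C_e neighbors C_i and either C_c or C_d
                    for e in nbrs[i] & (nbrs[c] | nbrs[d]):
                        keep("central", (i, a, b, c, d, e))
                    # almost central: C_e neighbors C_a and C_b
                    for e in nbrs[a] & nbrs[b]:
                        keep("almost central", (i, a, b, c, d, e))
            # one-sided: C_c, C_d, C_e extend away from C_i
            for c in nbrs[a] & nbrs[b]:
                for d in nbrs[a] & nbrs[c]:
                    for e in nbrs[b] & nbrs[c]:
                        keep("one-sided", (i, a, b, c, d, e))
    return found


# Hypothetical grid: cell 0 surrounded by the ring of cells 1..6.
ring = [1, 2, 3, 4, 5, 6]
nbrs = {0: set(ring)}
for k, cell in enumerate(ring):
    nbrs[cell] = {0, ring[k - 1], ring[(k + 1) % 6]}

print(len(degree1_stencils(nbrs, 0)))
print({kind: len(s) for kind, s in degree2_stencils(nbrs, 0).items()})
```

On this tiny grid only central stencils exist for cell 0, because the other
two types need cells beyond the first ring of neighbors.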

8  Other  recovery  techniques 

In all that precedes, we have worked with piecewise polynomial
reconstructions. This is quite standard thanks to the ease of computing with
polynomials. It is also accurate, since in the regular case we have error
estimates. However, one might wonder whether this is optimal in the sense of
minimizing the error between the reconstruction and the function u to be
reconstructed. Since the latter is known only through its average values on
the control volumes, it is better to ask that the distance between the
reconstruction and the space in which u lives be minimized, subject to some
linear constraints.

Let us give a simple example. It is well known that Lagrange interpolation
is not optimal when we want to interpolate data while minimizing other
quantities, like a norm of derivatives. Let
$a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b$ and $y_i$, $i = 1, \cdots, n$.
If one wishes to minimize $\int_a^b [f''(x)]^2\,dx$ in the space of twice
continuously differentiable functions with the constraints $f(x_i) = y_i$,
the answer is given by cubic splines.

Since  accuracy  as  well  as  robustness  of  such  approximations  applied  to 
hyperbolic  conservation  laws  depend  mainly  on  the  recovery  algorithm  it 
makes  sense  to  ask  for  recovery  algorithms  satisfying  an  optimality  condition. 
It  turns  out  that  the  solution  to  this  class  of  problems  can  be  found  in  an 
abstract  setting  in  the  papers  by  Golomb  and  Weinberger  [13]  and  Micchelli 
and  Rivlin  [29],  in  which  a  theory  of  optimal  recovery  is  developed.  Within 
this  theory  one  is  able  to  show  that  polynomial  recovery  is  only  optimal  in  a 
trivial  sense. 

The  idea  of  applying  the  theory  of  optimal  recovery  to  numerical  approx¬ 
imations  of  differential  equations  goes  back  to  Morton  and  his  co-workers 
in  1988,  see  [7].  They  considered  finite  element  approximations  and  used 
piecewise  polynomial  recovery  to  get  information  about  point  values  and 
derivatives  of  the  unknown  solution. 

The  details  of  these  recovery  techniques  can  be  found  in  [20,36]. 
An example is the following. Instead of taking the polynomial expansion (4.1),
the following expansion is considered (this corresponds to a spline in a
Beppo-Levi space, the so-called thin-plate spline)

$$R(x) = \sum_{l=0}^{M-1} \lambda_l \left\langle A(C_{i_l})^y,\ |x-y|^2 \log(|x-y|) \right\rangle + a_{00} + a_{10}\,x + a_{01}\,y \qquad (8.1)$$

where the averaging process is done with respect to the variable y, and the
integrals are computed with a quadrature formula. The cells $C_{i_l}$ belong
to a stencil $S_i$ around the node $M_i$ of the mesh. It is constructed in
the same spirit as before. The constraints are

$$\langle A(C_{i_k}), R \rangle = \langle A(C_{i_k}), u \rangle, \quad k = 0, \cdots, M-1,$$

where $C_{i_0}$ is the control volume associated with $M_i$, and

$$\text{for all } p \in \{1, x, y\}: \quad \sum_{l=0}^{M-1} \lambda_l \langle A(C_{i_l}), p \rangle = 0.$$

This gives an (M + 3) x (M + 3) linear system that is solvable if the stencil
$S_i$ contains an admissible stencil of 3 elements for the linear
reconstruction. The expansion (8.1) minimizes

$$J(u) = \int_{\mathbb{R}^2} \left[ \left(\frac{\partial^2 u}{\partial x^2}\right)^2 + 2 \left(\frac{\partial^2 u}{\partial x\,\partial y}\right)^2 + \left(\frac{\partial^2 u}{\partial y^2}\right)^2 \right] dx\,dy.$$

If $M = \#S_i = 3$, R is a linear polynomial. Thus, in practical applications,
a stencil of 4 elements is taken. For ENO applications, the reconstruction is
performed on the stencil that has the smallest total variation, computed once
more by a quadrature formula.

Several numerical applications have been tried with this technique, in
particular in [36], on rotating cone problems and the Colella and Woodward
test case of supersonic flow in a channel with a forward-facing step.
Improvements with respect to the linear reconstruction technique are reported
there; they are particularly pronounced for the rotating cone problem.


9  A  class  of  high  order  numerical  schemes  for 
compressible  flow  simulations 

We have applied the polynomial reconstruction method, with polynomials of
degree 2, to various test cases. Here, as said before, the physical variables
are approximated in the setting of §2.

We have reduced the order of accuracy of the reconstruction for cells that
are too close to the boundary. For them, a proper calculation of the ENO
stencil may be impossible because the set of possible stencils is biased in
one direction due to the boundary. For the third-order scheme, these cells
are those related to a mesh point that belongs to a triangle having at least
one point on the boundary.


9.1  Numerical  tests 

All the examples we propose now have been computed with the second- and
third-order ENO schemes. The ratio of specific heats, $\gamma$, is always set
to 1.4. We present numerical computations of the reflection of a shock on a
wedge. Other calculations, including the Colella and Woodward test case and
2D shock tube problems, can be found in [3,2].

In these two examples, the post-shock conditions are $\rho = \gamma$,
$u = v = 0$ and $p = 1$. The pre-shock conditions are determined from the
Rankine-Hugoniot relations with a shock Mach number of 5.5. The only
difference between the two cases is the angle of the wedge,
$\theta = 30°$ in one case and $\theta = 45°$ in the other. The kind of mesh
we use is also different. In the first case, it is a triangular mesh with
8569 nodes and 16806 triangles; in the second, it is made of squares, with
triangles on the boundary. It has 23990 nodes and 23771 elements (triangles
and quadrangles).

In the first case, we have a double Mach reflection [9]. By comparing the
density displayed in Figures 13.6-A and 13.6-B, it is clear that our
third-order ENO scheme improves the resolution of the various features of the
flow. In particular, the slip line coming out of the triple point is clearly
visible in Figure 13.6-B while it is barely visible in Figure 13.6-A.

The second example is even more interesting. First, it shows that our
methodology is easily extendable to more general meshes, see Figure 13.8.
Second, the test case itself demonstrates the improvement in accuracy from
first order (Figure 13.9-A) to second order (Figure 13.9-B) and third order
(Figure 13.9-C). Following Ben Dor [9], Figure 2.42-c, page 102, we see that
$\theta = 45°$ and M = 5.5 corresponds to a double Mach reflection very close
to the regular reflection transition. In Figure 13.9-A, we see a regular
reflection. In Figures 13.9-B and 13.9-C, we see a double Mach reflection,
but the details of the internal shock and the slip line coming out of the
triple point are much better resolved in the third-order simulation.


9.2  Some  remarks  on  the  formal  accuracy  of  the  scheme 

We would like to point out some difficulties of these high-order finite volume
schemes that have apparently gone unnoticed so far. Following many authors,
we have recovered the density, the x- and y-components of the velocity and
the pressure. The choice of the last three variables is dictated by the fact
that (i) the density should remain positive, (ii) the pressure is a Riemann
invariant and should remain positive, (iii) in a Riemann problem, the normal
component of the velocity is also a Riemann invariant, hence the choice of
the velocity components may be wise. Nevertheless, all we have said on the
recovery problem assumes that the variable to be reconstructed is a conserved
one, which is true for the density only. So one may question the validity of
our approach.

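We first discuss the case of the velocity. In fact, the starting point of the
reconstruction procedure is the averaged velocity:

$$u_C = \frac{\langle A(C), \rho u \rangle}{\langle A(C), \rho \rangle} = \frac{\int_C \rho u\,dx}{\int_C \rho\,dx}$$

and

$$v_C = \frac{\langle A(C), \rho v \rangle}{\langle A(C), \rho \rangle} = \frac{\int_C \rho v\,dx}{\int_C \rho\,dx}.$$

Thus $u_C$ and $v_C$ appear to be true averages, not with respect to the
measure

$$\langle A(C), \cdot \rangle = \frac{\int_C \cdot\ dx}{\int_C dx}$$

but with respect to the measure

$$(\cdot)_C = \frac{\int_C \rho\, \cdot\ dx}{\int_C \rho\,dx}.$$

One can easily convince oneself that all that has been done with
$\langle A(C), \cdot \rangle$ is also true for $(\cdot)_C$, and things become
clear.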

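This is no longer true for the pressure, since the “averaged” pressure is

$$\frac{p_C}{\gamma - 1} = \langle A(C), E \rangle - \frac{1}{2 \langle A(C), \rho \rangle} \left\{ \langle A(C), \rho u \rangle^2 + \langle A(C), \rho v \rangle^2 \right\} = \frac{1}{\gamma - 1} \langle A(C), p \rangle + \mathcal{R}$$

where

$$2\mathcal{R} = \langle A(C), \rho u^2 \rangle + \langle A(C), \rho v^2 \rangle - \frac{\langle A(C), \rho u \rangle^2 + \langle A(C), \rho v \rangle^2}{\langle A(C), \rho \rangle}.$$

If there exists a measure $\mu$ such that $p_C = \int_C p\,d\mu$, a necessary
condition is $\mathcal{R} = 0$. Unfortunately, this cannot be expected in
general.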

An alternative that avoids these problems is to work directly on the conserved
variables, but then there is no control on the positivity of the pressure.
However, the improvement in accuracy is obvious, despite all these problems,
as can be seen from Figures 13.6-A, 13.6-B and 13.9.

We end this set of remarks by noticing that in the second-order case there
is no problem, because one can interpret the averaged quantities as their
values at the centroid of the control volume with second-order accuracy. Then
the “averaged” pressure has to be understood as the pressure at the centroid,
with second-order accuracy.
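
A small numerical check may make the role of the residual concrete. The
sketch below assumes a cell made of equal-size pieces carrying constant
states (rho, u, v, p), so that cell averages reduce to arithmetic means, and
an ideal gas with $E = p/(\gamma-1) + \rho(u^2+v^2)/2$. The function names
and the sample states are illustrative assumptions.

```python
GAMMA = 1.4


def mean(values):
    return sum(values) / len(values)


def averaged_pressure(states, gamma=GAMMA):
    """states: list of (rho, u, v, p) on equal-size pieces of one cell C.

    Returns (p_C, <p>, R): the pressure computed from the averaged conserved
    variables, the true average of p, and the residual R.
    """
    rho = mean([s[0] for s in states])                       # <rho>
    mx = mean([s[0] * s[1] for s in states])                 # <rho u>
    my = mean([s[0] * s[2] for s in states])                 # <rho v>
    energy = mean([s[3] / (gamma - 1.0)
                   + 0.5 * s[0] * (s[1] ** 2 + s[2] ** 2)
                   for s in states])                         # <E>
    p_mean = mean([s[3] for s in states])                    # <p>
    p_c = (gamma - 1.0) * (energy - 0.5 * (mx ** 2 + my ** 2) / rho)
    kinetic = mean([s[0] * (s[1] ** 2 + s[2] ** 2) for s in states])
    residual = 0.5 * (kinetic - (mx ** 2 + my ** 2) / rho)   # R
    return p_c, p_mean, residual


# Same pressure everywhere, opposite velocities in the two halves of the cell.
p_c, p_mean, residual = averaged_pressure([(1.0, 1.0, 0.0, 1.0),
                                           (1.0, -1.0, 0.0, 1.0)])
print(p_c, p_mean, residual)
```

In this example the pressure is uniform, yet the value recovered from the
averaged conserved variables differs from it by $(\gamma - 1)\mathcal{R}$;
the residual vanishes when the velocity is uniform in the cell.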


10  Multiresolution  Analysis 

10.1  Introduction 

The simulation of engineering problems requires more and more sophisticated
numerical models and finer and finer meshes discretizing complex geometries.
Even with the most powerful computers, these tasks are very challenging
and cheaper computing techniques are needed.

Modern numerical methods, such as the TVD or ENO schemes, use many switches
that are essential only in a small part of the flow. To reduce their CPU
cost, the structure of the solution appears as an appealing guide to a better
distribution of the computer resources. This goal can be achieved via
multiresolution (MR) analysis. Recently, A. Harten has developed a framework
that is general enough to contain some of the wavelet families [12] on
$\mathbb{R}$ but can also be applied when the data are represented on
unstructured meshes by cell averages, the natural output of finite volume
schemes.

Here, we first describe a technique to represent data which originate from
discretizations of functions on unstructured meshes in terms of their local
scale components and give some numerical applications. Then, we show how
to exploit a particular version of Harten’s multiresolution analysis to reduce
the CPU cost of ENO schemes. Last, we provide some numerical illustrations.


10.2  Harten’s  multiresolution  analysis  on  general  meshes 

This section is a very compact summary of [5]. We consider a domain
$\Omega$, with a triangulation $\mathcal{T}(\Omega)$.

We construct a set of control volumes $(C_i)_{i=1,N}$ as before; they should
exactly cover $\Omega$, with $C_i \cap C_j = \emptyset$ if $i \ne j$. In all
the numerical examples, the control volumes are the elements of a dual mesh.
If f belongs to $L^1(\Omega)$, we can represent f by its average values
$\langle A(C_i), f \rangle$. The idea is to represent f not by the set
$(\langle A(C_i), f \rangle)_{i=1,\cdots,N}$ but by an equivalent
representation made of the cell averages on a coarser mesh and a set of scale
coefficients that measure the difference in information between the
representations of f on coarser and coarser meshes.

The method needs three ingredients: (i) an agglomeration procedure to
construct levels of decreasing resolution, (ii) a discretization mapping from
each level of resolution onto a finite dimensional vector space and (iii) a
reconstruction mapping that is a right inverse of the discretization. We
detail each item of the above list. The definitions of the first two items
are very closely related. For a complete set of details, the reader is
referred to [5].


Discretization. We assume that we are given a sequence $(C_i)$ of
non-overlapping control volumes. We set
$\mathcal{D} : L^1(\Omega) \to \mathbb{R}^N$ defined by
$\mathcal{D}_i(f) = \langle A(C_i), f \rangle$. In the next paragraph, the
sequences of control volume clusters $\{\{C_j^l\}_j\}_{l=1,\cdots,L}$ are
labeled by l, and $\mathcal{D}^l$ will refer to the discretization defined
with the cells $\{C_j^l\}$ for one level l.

For numerical purposes, it is essential that if one knows the representation
of f on one level l, one also knows its representation on the coarser levels,
i.e. for indices smaller than l. This nestedness, defined by Harten, can be
stated formally as:

$$\mathcal{D}^l(f) = 0 \ \text{ implies } \ \mathcal{D}^{l-1}(f) = 0. \qquad (10.1)$$

Since the discretization operator is known, everything relies on the way
the cells $C_j^l$ are constructed.

An agglomeration procedure. We wish to construct L > 0 levels of
discretization. We rename the control volumes $C_i$ defined on
$\mathcal{T}(\Omega)$ by $C_i^L$; there are $N_L = N$ such control volumes.
We set $\mathcal{C}^L = \{C_j^L\}_{j=1,N_L}$. Assume that $\mathcal{C}^l$,
$1 < l \le L$, is known. If $\{I_1^l, \cdots, I_{N_{l-1}}^l\}$ is a partition
of $\{1, \cdots, N_l\}$, we set

$$C_i^{l-1} = \bigcup_{j \in I_i^l} C_j^l \qquad (10.2)$$

and $\mathcal{C}^{l-1} = \{C_1^{l-1}, \cdots, C_{N_{l-1}}^{l-1}\}$. Clearly,
$\bigcup_i \overline{C_i^{l-1}} = \overline{\Omega}$ and
$C_i^{l-1} \cap C_j^{l-1} = \emptyset$ if $i \ne j$, since the $C_j^l$ are
assumed to be open. Thanks to (10.2), the nestedness property (10.1) is true.
In fact an explicit calculation shows that

$$\mathcal{D}_i^{l-1}(f) = \sum_{j \in I_i^l} \frac{|C_j^l|}{|C_i^{l-1}|}\, \mathcal{D}_j^l(f).$$

This obvious equality enables one to get knowledge of the discretization of f
at any level l < L without knowing f explicitly, provided $\mathcal{D}^L(f)$
is known. Now the key issue is the definition of the $I_j^l$s.
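
The volume-weighted formula above translates directly into code. The sketch
below assumes that cell averages and cell volumes are stored in plain lists
and that the partition is given as a list of index lists; these names and
the sample data are illustrative.

```python
def coarsen(fine_avg, fine_vol, partition):
    """Cell averages and volumes on level l-1 from those on level l.

    partition[i] lists the indices I_i^l of the fine cells merged into
    coarse cell i.
    """
    coarse_avg, coarse_vol = [], []
    for members in partition:
        vol = sum(fine_vol[j] for j in members)              # |C_i^{l-1}|
        avg = sum(fine_vol[j] * fine_avg[j] for j in members) / vol
        coarse_vol.append(vol)
        coarse_avg.append(avg)
    return coarse_avg, coarse_vol


# Hypothetical fine level with five cells merged into three coarse cells.
fine_avg = [1.0, 3.0, 2.0, 2.0, 5.0]
fine_vol = [1.0, 1.0, 2.0, 1.0, 1.0]
coarse_avg, coarse_vol = coarsen(fine_avg, fine_vol, [[0, 1], [2], [3, 4]])
print(coarse_avg, coarse_vol)
```

Because each coarse average is a volume-weighted mean, the total integral of
the data is the same on both levels, which is the discrete counterpart of
the nestedness property (10.1).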

If the control volumes were squares, as on a Cartesian mesh, the obvious
procedure would be to gather four cells that share a common corner. This is
what is done in multigrid, or in domain decomposition methods (except that
here we are likely to have many subdomains, depending on the level of
resolution).

In the present context, the same principles have been applied. In [5], we
have used the agglomeration procedure described in [14], initially derived
for multigrid acceleration; it has been used for the numerical examples of
this section. For the flow simulations, we have preferred to use a recursive
domain decomposition algorithm [26] because it gave us better control of the
number of agglomerated cells and of their shape.


Reconstruction. Once the discretization of f is known at level l, we need
a reconstruction of f, i.e. we need to find a function
$R_l(f) \in L^1(\Omega)$ such that
$\mathcal{D}^l[R_l(f)] = \mathcal{D}^l(f)$. We have chosen to look for a
piecewise polynomial function of degree r (= 2 in practice) that is defined
locally, for each cell $C_i^l$. Since $R_l$ is a right inverse of
$\mathcal{D}^l$, the particular choice of the discretization imposes
$\langle A(C_i^l), R_l(f) \rangle = \langle A(C_i^l), f \rangle$, but this is
the only constraint. Any other recovery procedure might have been employed,
provided this conservation constraint is satisfied. In particular, the
recovery procedure might be nonlinear, or it might use non-polynomial
functions as suggested by [36] and sketched in §8.

This shows that the suitable reconstruction technique should be the same
as the one we have used for the ENO methods in §4. The only remaining
question is how to define the stencil $S_i^l$. This is achieved by a
heuristic procedure inspired by [1,3], as described in §6. However, we only
need one stencil per cell $C_i^l$ instead of several as in §6.

First we identify each cell $C_i^l$ with its center of gravity. Thanks to
this, the method is not restricted to control volumes generated by triangular
meshes, because at this level we may forget the origin of the control volumes.
Second, we build a Delaunay mesh¹ on these points, and remove the spurious
triangles that lie outside the original domain $\Omega$. More precisely, we
say that a triangle lies outside the domain if its centroid is not in the
domain. This can be checked in practice with the help of an efficient sorting
tool. Then, we also remove the triangles that are on the boundary of this new
mesh and are too flat. Once this is done, we construct the stencils: for each
triangle (A, B, C) of this mesh (i.e. for a set of 3 cells at level l), we add
the three other points/cells shown in Figure 13.10. More details can be found
in [3].

Data compression. Once all this is done, we can consider the $N_{l+1}$
errors, defined for each cell $C_i^{l+1}$ by
$e_i^{l+1} = \langle A(C_i^{l+1}), R_l(f) \rangle - \mathcal{D}_i^{l+1}(f)$.
By construction, we have
$\sum_{i \in I_j^{l+1}} |C_i^{l+1}|\, e_i^{l+1} = 0$.

We have $N_l$ such linear relations between the errors at level l + 1, thus
we can define $N_{l+1} - N_l$ independent scale coefficients $d_i^{l+1}$ by
the following computation: for each $j \le N_l$, we set
$d_i^{l+1} = e_i^{l+1}$ for all $i \in I_j^{l+1}$ except the last index, and
we set $d^{l+1} = (d_1^{l+1}, \cdots, d_{N_{l+1}-N_l}^{l+1})$. It can be
shown that

$$\mathcal{D}^L(f) = \left( \langle A(C_1^L), f \rangle, \cdots, \langle A(C_{N_L}^L), f \rangle \right) \mapsto \left( d^L, \cdots, d^2, \mathcal{D}^1(f) \right) \qquad (10.3)$$

is a (linear) one-to-one mapping.

Moreover, from Theorem 1, if f admits a p-th continuous derivative, then
$d_j^l = O(h_l^{p+1})$ ($h_l$ is a characteristic size of the $C_j^l$),
provided the mesh is regular enough. This remark enables one to represent
$\mathcal{D}^L(f)$ within a given tolerance $\varepsilon$ with fewer than
$N_L$ degrees of freedom. To do so, we replace the scale coefficients in
(10.3) by truncated ones,

$$\tilde{d}_j^l = \begin{cases} d_j^l & \text{if } |d_j^l| > \varepsilon_l = \varepsilon, \\ 0 & \text{else.} \end{cases}$$

More sophisticated expressions for $\varepsilon_l$ can be used, but this does
not much affect the compression factor $\mu$, defined as the ratio between
$N_L$ and $N_1$ plus the number of non-zero $\tilde{d}_j^l$,

$$\mu = \frac{N_L}{N_1 + \sum_{l=2}^{L} \#\{ j : |d_j^l| > \varepsilon_l \}}.$$
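
The encoding, truncation and compression factor can be illustrated on a
deliberately simplified setting. The sketch below assumes a one-dimensional
uniform grid with a power-of-two number of cells, pairwise agglomeration,
and a piecewise-constant prediction in place of the degree-2 polynomial
reconstruction used in the text; all function names are illustrative.

```python
def encode(fine, eps=0.0):
    """Split cell averages into the coarsest averages and truncated scale
    coefficients, one list per level (finest level first)."""
    details = []
    level = list(fine)
    while len(level) > 1:
        coarse = [0.5 * (level[2 * j] + level[2 * j + 1])
                  for j in range(len(level) // 2)]
        # Prediction error on the first child of each coarse cell. The error
        # on the second child is minus this value, so it is not stored.
        d = [coarse[j] - level[2 * j] for j in range(len(coarse))]
        details.append([x if abs(x) > eps else 0.0 for x in d])
        level = coarse
    return level, details


def decode(coarsest, details):
    """Rebuild the finest cell averages from the output of encode."""
    level = list(coarsest)
    for d in reversed(details):
        fine = []
        for c, x in zip(level, d):
            fine += [c - x, c + x]
        level = fine
    return level


def compression_factor(n_fine, coarsest, details):
    """N_L divided by N_1 plus the number of non-zero scale coefficients."""
    kept = sum(1 for d in details for x in d if x != 0.0)
    return n_fine / (len(coarsest) + kept)


data = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0]   # a step between two cells
coarsest, details = encode(data)
print(decode(coarsest, details))
print(compression_factor(len(data), coarsest, details))
```

For this step function only the coefficients attached to cells containing
the jump are non-zero, so the data are stored with fewer values than the
number of fine cells. A positive `eps` additionally discards small
coefficients at the price of an approximation error.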

1 A triangulation is a Delaunay triangulation if no point of the triangulation
lies strictly inside the circumscribed circle of any of its triangles.

10.3 Numerical examples

The power of the method is demonstrated by means of a piecewise smooth
function on a complex geometry, which looks like what is shown in the upper
left part of Figure 13.11. In the upper part of the figure one sees cell
averages on a sequence of coarser grids. The two plots in the lower part are
isolines of truncated scale coefficients, one plot for the restriction from
the fine to the medium grid, one for the restriction from the medium to the
coarse grid. If the reconstruction starts from the coarsest grid and uses
only the non-zero scale coefficients, then the cell averages on the finest
grid as shown in the upper left part are recovered within plot accuracy.

In Table 13, under the entry $f_2$, we give the tolerance $\varepsilon$, the
compression factor $\mu$, and the $L^\infty$- and $L^1$-errors for this
particular function. The same information is given for
$f_1 = \cos 2\pi(x^2 + y^2)$. In [5], other examples and details are
presented. In particular, we try to quantify what we lose by using
unstructured meshes, compared to Cartesian ones.

They all indicate that our method is stable and has the same accuracy
on structured and unstructured meshes.


10.4  Multiresolution  analysis  and  ENO  schemes 

In the ENO method of §9, the key point is the use of a piecewise polynomial
reconstruction, the same as here, and a stencil selection procedure. Then the
MUSCL method is applied, with an ENO reconstruction of the physical
variables. Nevertheless, this is costly. The previous concepts can help to
reduce significantly the number of ENO reconstructions. The idea is to use a
two-level multiresolution scheme. Only one set of scale coefficients is
produced and we modify the ENO reconstruction as follows: for each fine cell
$C_j^2$ and physical variable f, if $|d_j^2| / \|f\| < \varepsilon$, we use
the reconstruction of f on the coarse level in the MUSCL method; else, we
apply the ENO algorithm. The other details of the scheme remain the same.


10.5  A  numerical  experiment 

We have applied this simplified ENO scheme to various configurations: the
interaction of a shock with a 90° wedge, of a shock with a ramp and of a shock
and a vortex with a ramp. Here, we illustrate the method on the interaction of
a shock and a 30° ramp. The shock Mach number, evaluated from the post-shock
conditions, is 5.5. The fine mesh has 33943 points and 67224 triangles; the
coarse level is made of 10000 cells. Figure 13 shows the Mach number
isolines. Most of the known structures of this double Mach reflection are
well represented, in particular the slip line out of the triple point, as
well as the vortex that results from the interaction of the weak shock out of
the reflected one and this slip line.

Figure 13.13 shows the isolines of the ENO indicator for the density (all
the error indicators look similar, except for the slip lines, where nothing
is detected for the pressure). Here, $\varepsilon = 10^{-2}$, and only 14% of
pure ENO reconstruction was done at this stage of the computation. The
compression factor is always larger than 3/4 of its maximum possible value
$N_L/N_1$. The simplified ENO scheme runs, in this case, 2.5 times faster
than the pure ENO one. On the other examples, the ratio was about 2-2.5. This
ratio is clearly problem dependent. The most expensive part of the scheme is
the flux evaluations (Roe’s scheme here), while the reconstruction cost,
overhead included, becomes almost negligible.


11 Other applications: Hamilton-Jacobi equations


ENO schemes have been applied to other problems, in particular the
approximation of the Hamilton-Jacobi equations

$$\frac{\partial u}{\partial t} + H(x, u, \nabla u) = 0, \quad x \in \Omega,\ t > 0$$

$$u(x, 0) = u_0(x), \quad x \in \Omega,\ t = 0 \qquad (11.1)$$

Boundary conditions.


In (11.1), $u : \Omega \times \mathbb{R}^+ \to \mathbb{R}$, where $\Omega$ is
an open subset of $\mathbb{R}^N$. Here, N = 2, but this does not change
anything in the discussion. The boundary conditions can be of the Dirichlet
type for example, but this point will not be discussed here, see [45,41] for
details.

The existence and uniqueness of the viscosity solution of the Cauchy problem
(11.1) is discussed in [46,45] and the references therein. Our purpose is to
discuss some elements of the numerical approximation of

$$\frac{\partial u}{\partial t} + H(\nabla u) = 0, \quad x \in \mathbb{R}^2,\ t > 0$$

$$u(x, 0) = u_0(x), \quad x \in \mathbb{R}^2,\ t = 0 \qquad (11.2)$$

with a triangular unstructured mesh. More details are given in [42]; the
generalisation to (11.1) is rather obvious via [41].

In [44,47], several numerical upwind schemes have been constructed. They
rely on the strong formal analogy between the viscosity solutions of (11.2)
and the weak solutions of

$$\frac{\partial W}{\partial t} + \frac{\partial H(W)}{\partial x} = 0, \quad x \in \mathbb{R}^2,\ t > 0$$

$$W(x, 0) = \nabla_x u_0(x), \quad x \in \mathbb{R}^2,\ t = 0. \qquad (11.3)$$

From (11.3), any reasonable numerical scheme for conservation laws should
give rise to a numerical scheme for (11.2): Godunov, Lax-Friedrichs, etc. In
[47], only the case of a regular Cartesian grid was considered. Any of the
proofs could be applied even to a non-orthogonal structured mesh. Their work
has been generalised in [42] and error estimates are provided.

We consider schemes of the form

$$u_i^{n+1} = u_i^n - \Delta t\, \mathcal{H}(\nabla_{T_1} u^n, \cdots, \nabla_{T_{k_i}} u^n). \qquad (11.4)$$

In (11.4), the set $\{T_1, \cdots, T_{k_i}\}$ is the set of triangles that
share $M_i$ as a vertex, and the quantities $\nabla_{T_j} u^n$ are numerical
gradients of the node values $u^n$. The scheme is formally first order when
$\nabla_{T_j} u^n$ is the gradient of the (continuous) piecewise linear
interpolation at the vertices of the mesh.
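
To make the structure of (11.4) concrete, here is a one-dimensional analogue,
not the triangular-mesh scheme of [42]. It is a sketch under stated
assumptions: a uniform periodic grid, the two one-sided difference quotients
playing the role of the gradients in the neighboring triangles, a
Lax-Friedrichs numerical Hamiltonian, and the sample Hamiltonian
$H(p) = p^2/2$. The function name and all parameter values are illustrative.

```python
import math


def lax_friedrichs_step(u, dt, h, hamiltonian, alpha):
    """One first-order step for u_t + H(u_x) = 0 on a periodic 1-D grid."""
    n = len(u)
    new = []
    for i in range(n):
        p_minus = (u[i] - u[i - 1]) / h            # gradient in the left cell
        p_plus = (u[(i + 1) % n] - u[i]) / h       # gradient in the right cell
        h_num = (hamiltonian(0.5 * (p_minus + p_plus))
                 - 0.5 * alpha * (p_plus - p_minus))
        new.append(u[i] - dt * h_num)
    return new


n = 100
h = 1.0 / n
u0 = [math.sin(2.0 * math.pi * i * h) for i in range(n)]
alpha = 7.0                 # upper bound for |H'(p)| = |p| <= 2*pi here
dt = 0.5 * h / alpha        # small enough to keep the scheme monotone
u = u0
for _ in range(200):
    u = lax_friedrichs_step(u, dt, h, lambda p: 0.5 * p * p, alpha)
print(min(u), max(u))
```

With alpha at least as large as $|H'|$ and dt below h/alpha, the update is a
non-decreasing function of each node value and reproduces constants, so the
computed values stay within the bounds of the initial data.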

It is shown that a first-order monotone scheme that has the additional
property of being “intrinsic” is convergent, and one has the following error
estimate

$$\max_{M_i,\ n \in \mathbb{N}} |u_i^n - u(M_i, t_n)| \le C(\text{Mesh}, T, H, u_0)\, \sqrt{h}$$

where $M_i$ is a generic mesh point, $t_n = n \Delta t$, and the constant C
depends on standard regularity properties of the mesh, the Lipschitz constant
of H and $u_0$. The parameter h is the maximum of the diameters of the
circumscribed circles of the triangles. The proof does not depend on N = 2,
the dimension of the space. By saying that a numerical Hamiltonian is
“intrinsic”, we mean the following property: take any triangle T and cut it
in two parts as in Figure 11.2; then, since u is assumed to be linear in T,
the gradient of u is the same in $T_{cut}$ and $T'_{cut}$. The number of
arguments in $\mathcal{H}$ is increased by one, but two among them are
identical: they are $\nabla_T u$. The numerical Hamiltonian is intrinsic if
the value of $\mathcal{H}$ is not modified. This is obviously true for the
Godunov solver, and appropriate weights give the same property for the
Lax-Friedrichs one. This property is useful in the proof: the fundamental
difference between an unstructured mesh and a Cartesian one is that the
mesh is not invariant by translation. The “intrinsic” property, in some
sense, replaces the invariance by translation. Note that a structured mesh is
not, in general, invariant by translation.


Fig. 11.1. The neighboring triangles of $M_i$


Fig. 11.2. Illustration of the intrinsic property


To increase the accuracy of the scheme, any of the previous ENO or
WENO techniques can be applied. The algorithm is:

1. Given the node values $\{u_i^n\}$, compute an ENO/WENO Lagrange
reconstruction within each triangle with the algorithm of §6. We call
$\pi_T u^n$ the reconstruction in a generic triangle T.

2. Given any node $M_i$, consider the set $\{T_1, \cdots, T_{k_i}\}$ and
compute

$$\nabla \pi_{T_1} u^n(M_i), \cdots, \nabla \pi_{T_{k_i}} u^n(M_i),$$

the gradients of the reconstructions at the vertex $M_i$.

3. Compute the numerical Hamiltonian with these arguments.

4. Take any Runge-Kutta scheme, for example the TVD ones above, to
update the solution.

The ENO algorithm has been successfully used in [42] to compute the ray
paths through a lens, or for a geophysical problem [43].
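
Step 4 can be sketched with the standard second-order TVD Runge-Kutta scheme
of Shu and Osher. This is an assumption for illustration: it may differ from
the variants presented earlier in these notes, and the function name and the
test operator are hypothetical. Here `rhs(u)` stands for the spatial
operator, i.e. minus the numerical Hamiltonian evaluated at every node.

```python
def tvd_rk2_step(u, dt, rhs):
    """One second-order TVD Runge-Kutta step for du/dt = rhs(u)."""
    u1 = [a + dt * b for a, b in zip(u, rhs(u))]        # first Euler stage
    u2 = [a + dt * b for a, b in zip(u1, rhs(u1))]      # second Euler stage
    return [0.5 * a + 0.5 * b for a, b in zip(u, u2)]   # convex combination


# Hypothetical scalar test problem du/dt = -u.
print(tvd_rk2_step([1.0], 0.1, lambda v: [-x for x in v]))
```

The step is a convex combination of forward Euler stages, so any bound that
the Euler update satisfies under its time-step restriction carries over to
the full step.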

12  Schemes  with  adaptive  limiters  and  fluxes 

The simulation of severe flow conditions, such as reactive flows, requires
robust numerical methods. Many computations use a class of algorithms
based either on flux vector splitting (FVS) or on flux difference splitting
(FDS). Liou and Steffen [49] have proposed a remarkably simple upwind FVS.
This splitting, called AUSM, treats the convective and pressure terms
separately. The convective quantities are extrapolated to the cell interface
in an upwind-biased manner, using a properly defined cell-face advection Mach
number. AUSM keeps the qualities of FVS (robustness and efficiency) and
recovers the accuracy attributed to FDS. Radespiel and Kroll [50] proposed
several modifications in order to improve the scheme’s ability to solve
viscous flows correctly. In particular, these include a switch from AUSM to
van Leer flux splitting (VL) through strong shock waves. In this section, we
retain this idea but use it differently [59]. The switch, here, is related to
the local accuracy of the scheme. When the scheme degenerates into a
first-order one (outside a shock wave), it is convenient to use AUSM; when
the scheme is second- or third-order, VL is better in order to minimize the
error terms.

To capture strong and/or rapid physical fluctuations accurately, the local
variation of each quantity has to be incorporated as much as possible in the
construction of the scheme. ENO schemes choose the stencil which provides the
most regular solution in order to minimize numerical overshoots and
undershoots. In this method, we take the stencil which minimizes the
numerical error terms (dissipative and dispersive terms). These terms have
different expressions depending on the local evolution of the quantities. To
improve efficiency, the equivalent system (ES) needs to be studied, including
the expression for the slope limiters. Their expressions are controlled not
only by the local but also by the surrounding physical variation of the
quantities. For each quantity, six different cases are considered, each
associated with a different physical variation. A triad of limiters is
defined which minimizes or cancels the second-order truncation errors.

From this study, a new explicit scheme is written. It is no more complicated
than a standard TVD-MUSCL scheme and it gives a more precise solution. It is
applied to the 1-D test case proposed by Shu and Osher [51] to simulate the
interaction between a moving shock wave and a turbulent flow. The results
show that we obtain the same precision as with their ENO scheme. The improved
accuracy is also demonstrated by the computation of a 2-D supersonic jet.

12.1  MUSCL  Approach  and  Flux  Splittings 

The hyperbolic part of the conservation form of the 1-D Navier-Stokes
equations is classically written:

$W_t + F_x = 0$ with $W = [\rho, \rho u, \rho E]^T$,
$W(x, 0) = W^0(x)$, $-\infty < x < +\infty$, $t > 0$,                 (12.1)

where $\rho$, $u$ and $E$ are the density, the velocity and the total energy.
In the discrete form, (12.1) is expressed as:

$W_j^{n+1} = W_j^n - \sigma\,(\mathcal{F}_{j+1/2} - \mathcal{F}_{j-1/2})$   (12.2)

with $W_j = W_j^n = W(V_j^n)$, $\sigma = \Delta t/\Delta x$, and
$\mathcal{F}_{j+1/2} = \mathcal{F}(V_{j-1}^n, V_j^n, V_{j+1}^n, V_{j+2}^n)$.
Note that $W$ are the conserved variables; the flux $\mathcal{F}$ need not be
expressed in terms of $W$ but in terms of new variables $V$. This method is
intended to improve the numerical simulation of compressible mixing layers,
possibly with chemical reactions; for that reason, we have chosen
$V = (\rho, u, T)^t$, where $T$ is the temperature. Note that all the
limitation procedures are performed on the variables $V$. The numerical flux
$\mathcal{F}$ verifies $\mathcal{F}(V, \ldots, V) = F(V)$ (note that we write
it in terms of the $V$-variables). $\Delta x$ is assumed to be constant and
$\Delta t$ is related to $\Delta x$ by a CFL condition. With the MUSCL
approach [52], the backward and forward extrapolated values of $V$ at the
interface $j + 1/2$ can be written as:

$V^L_{j+1/2} = L(V_{j-1}, V_j, V_{j+1})$ and $V^R_{j+1/2} = R(V_j, V_{j+1}, V_{j+2})$.

At the interface $j - 1/2$, we have:

$V^L_{j-1/2} = L(V_{j-2}, V_{j-1}, V_j)$ and $V^R_{j-1/2} = R(V_{j-1}, V_j, V_{j+1})$,

where the limiters $\varphi_1^{L,R}$ and $\varphi_2^{L,R}$ are non-linear
functions of $r_j$, with $r_j = (V_j - V_{j-1})/(V_{j+1} - V_j)$ (component
by component). The non-linear interpolations $L$ and $R$ have to verify the
following properties: homogeneity, translation invariance, left-right
symmetry, monotonicity and convexity. The flux is written in the general form

$F_{j+1/2} = F(V^L_{j+1/2}, V^R_{j+1/2}) - \Phi\,\Delta G$, where
$\Phi\,\Delta G = \Phi\,[G(V^R_{j+1/2}) - G(V^L_{j+1/2})]$ is a dissipation
term. We are particularly interested in the FVS_VL and AUSM schemes. As in
[50], we couple both schemes, but our coupling is different: it is based on
an analysis of the ES and takes account of the properties of each scheme at
first and second order. This coupling is advantageous because the expressions
for the two fluxes are very similar and yet exhibit different properties. In
the case of a perfect gas, with the constant-pressure specific heat $C_p$ and
the specific heat ratio $\gamma$ assumed to be constant, if we define


then both splittings can be written, at the interface $j + 1/2$ and for
$-1 < M < 1$, as follows

- The FVS_VL scheme:

F  =  KSFCDS  +  F[s 


F+Fcds(Vl)  +  FmFcds(Vr)  +  F+  +  F-  (12.3) 


=  0 


-  The  AUSM  scheme: 

f  =  (f++f^Y-^v»)+f+  +  f- 

§  =  \M\  and  AG  =  \  [Ff  s(Ffi)  -  FCDS(VL)] 


(12.4) 


where  c,  p  and  H  represent,  respectively,  the  sound  speed,  the  static  pressure, 
and  the  total  enthalpy. 
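
As an illustration of the two splittings, the following Python sketch implements the standard published forms of van Leer's flux vector splitting and of the AUSM interface flux for a perfect gas with $|M| < 1$. It is a minimal sketch rather than a transcription of (12.3)-(12.4): the function names, the value $\gamma = 1.4$ and the use of primitive states $(\rho, u, p)$ are assumptions made here for illustration.

```python
import math

GAMMA = 1.4  # assumed specific heat ratio (perfect gas)

def physical_flux(rho, u, p, gamma=GAMMA):
    """Exact 1-D Euler flux F = (rho u, rho u^2 + p, rho u H)."""
    H = gamma / (gamma - 1.0) * p / rho + 0.5 * u * u  # total enthalpy
    return (rho * u, rho * u * u + p, rho * u * H)

def van_leer_split(rho, u, p, sign, gamma=GAMMA):
    """Van Leer split flux F+ (sign=+1) or F- (sign=-1), valid for |M| < 1."""
    c = math.sqrt(gamma * p / rho)            # sound speed
    M = u / c
    mass = sign * rho * c * 0.25 * (M + sign) ** 2
    w = (gamma - 1.0) * u + sign * 2.0 * c
    return (mass,
            mass * w / gamma,
            mass * w * w / (2.0 * (gamma * gamma - 1.0)))

def van_leer_flux(left, right, gamma=GAMMA):
    """Interface flux F+(V^L) + F-(V^R); states are (rho, u, p) tuples."""
    fp = van_leer_split(*left, +1, gamma)
    fm = van_leer_split(*right, -1, gamma)
    return tuple(a + b for a, b in zip(fp, fm))

def ausm_flux(left, right, gamma=GAMMA):
    """AUSM interface flux: upwinded convective part plus split pressure."""
    def side(rho, u, p, sign):
        c = math.sqrt(gamma * p / rho)
        M = u / c
        H = gamma / (gamma - 1.0) * p / rho + 0.5 * u * u
        m_split = sign * 0.25 * (M + sign) ** 2                  # M+ or M-
        p_split = 0.25 * p * (M + sign) ** 2 * (2.0 - sign * M)  # p+ or p-
        return m_split, p_split, (rho * c, rho * c * u, rho * c * H)

    mL, pL, phiL = side(*left, +1)
    mR, pR, phiR = side(*right, -1)
    m_half = mL + mR                       # interface advection Mach number
    phi = phiL if m_half >= 0.0 else phiR  # upwind choice of convected state
    return (m_half * phi[0], m_half * phi[1] + pL + pR, m_half * phi[2])
```

Both functions are consistent in the sense used in Section 12.1: when the left and right states coincide, they return the exact flux $F(V)$. They differ only in how the convective part is distributed between the two sides, which is what the error analysis below exploits.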

The analysis of (ES), obtained from Taylor expansions, quantifies the
truncation error of the discrete form as $\Delta x$ and $\Delta t \to 0$.
$V$ and $F$ are assumed to be analytic functions. For each component $V_i$,
the expansions reflect the surrounding physical behaviour associated with the
specific approach used here. Six different cases are considered for each
component $V_i$ (Fig. 13.14):

- No extremum at j

  • case 1 : monotonic evolution,

  • case 2 : extremum at the nodes j-1 and j+1,

  • case 3 : extremum at the node j-1 or j+1,

- Extremum at j

  • case 4 : no extremum at the nodes j-1 and j+1,

  • case 5 : extremum at the nodes j-1 and j+1,

  • case 6 : extremum at the node j-1 or j+1.
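
The six cases can be detected from the signs of four consecutive differences. The sketch below is one possible way to do this, not the authors' implementation: the function names are hypothetical, a zero difference is counted as positive (the convention used for DELSIGN in annex B), and a node is called an extremum when the differences on either side of it have opposite signs.

```python
def sign_with_zero_positive(x):
    """Sign of x, taking +1 when x == 0."""
    return 1 if x >= 0 else -1

def classify_case(v, j):
    """Return the case number (1..6) at node j of the 1-D sequence v.

    Requires v[j-2] .. v[j+2] to exist.
    """
    # Signs of v[j-1]-v[j-2], v[j]-v[j-1], v[j+1]-v[j], v[j+2]-v[j+1]
    s = [sign_with_zero_positive(v[k + 1] - v[k]) for k in range(j - 2, j + 2)]
    ext_left = s[0] != s[1]    # extremum at node j-1
    ext_here = s[1] != s[2]    # extremum at node j
    ext_right = s[2] != s[3]   # extremum at node j+1

    base = 4 if ext_here else 1
    n_neighbors = int(ext_left) + int(ext_right)
    if n_neighbors == 0:
        return base        # case 1 or 4
    if n_neighbors == 2:
        return base + 1    # case 2 or 5
    return base + 2        # case 3 or 6
```

For example, a sawtooth profile such as `[1, 0, 1, 0, 1]` gives case 5 at its central node, the shortest-wavelength situation discussed in the next section.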


12.2  First-order  Error  Terms  in  Space 

After calculating the Taylor expansions of $r(V)$, $\varphi_a(r(V))$
($a = 1, 2$) and of the fluxes $\Phi = F(\varphi_a(r(V)))$, for each of the
flux functions $F$ entering (12.3)-(12.4), at the node $j$ for both cases
$V_x \neq 0$ and $V_x = 0$, (12.1) is transformed into:

$W_t + F_x + \Delta x\,[A]\,V_{xx} + O(\Delta x^2) = 0$                (12.5)

where $[A]$ is a (3, 3) matrix. The first-order error term in space, for the
kth equation ($k = 1, \ldots, 3$) of the system (12.5) and for the splitting
(12.3), can be expressed:


_  l 
~  2 


E 

i=p,u,T 


Erk  =  E  AkiVix» 


{fdsf+>  +  EZipL'  +  f Gf)  [pf ] 

-(FdsF~'  +  FFZFr'  -  f Gf')  [gf] 


Vi 
For the AUSM approach (12.4), the expression is a little more complicated:


Erk  = 


1 

2 


E 


(FcdsF+>  +  ^Ff[  +  F+  +  f Gf)  [gf] 
~{F?sF-  +  ^F*  +  F-  -  f Gf)  [gf- ] 


Vi 


xx 


where  gL,R  =  gL’R(<p)  =  y’1^2  ^  +  Vi  ’2  ^  —  1  if  Vx  =  0  gL,R  =  gL,R(<p')  = 
<pfR(l)  -  <fi'  i(l)  if  K  #  0,  *>'  =  f ,  F+'  =  $£,  (F-)'  =  ( FL>R)'  = 

fc(G£’fi)'=^,e tc. 

In order to develop a simple analysis, we assume at node $j$:
$|V_i^R - V_i^L| \ll \max(|V_i^R|, |V_i^L|)$ ($i = 1, \ldots, 3$). That is,
the jump at the interface is considered to be weak (strong discontinuities
are excluded from the analysis). The case
$|V_i^R - V_i^L| \approx \max(|V_i^R|, |V_i^L|)$ is not considered in this
paper, although it may occur in the velocity under certain circumstances,
such as when this quantity has strong fluctuations around zero. The
expansions are calculated for positive values of $M$. The expressions for
$M < 0$ are obtained by symmetry ($g^L$ is replaced by $g^R$ and
reciprocally). The first-order term cancels if $g_i^{L,R} = 0$. In general,
we assume that

for $r < 0$, $\varphi_a = \varphi'_a = 0$ and then $g_i = \varphi_i(-1) = 0$.   (12.6)


Therefore, the first-order error term cancels if

$\varphi_1^{L,R} = \varphi_2^{L,R} = 2$ for $r = 3$                     (12.7)

if there is an extremum at node $j$, or if

$\varphi'^{L}_2 = \varphi'^{R}_2 = \varphi'_2$ for $r = 1$              (12.8)

if there is no extremum at node $j$.

The Taylor expansions at node $j$ include the presence of one extremum
(cases 4-6) or none (cases 1-3) at this point. On the other hand, they do not
say whether or not an extremum exists at the neighbors $j - 1$ and $j + 1$.
If there is no extremum at $j - 1$ and $j + 1$ (cases 1 and 4), no additional
constraint appears; but if an extremum is present at these points, then
either a different definition of $\varphi$ is required in order to preserve
second-order accuracy (cases 2 and 3), or the scheme accuracy automatically
degenerates (cases 5 and 6).

If $V_{i,x} \neq 0$ at node $j$, the condition (12.8) is easily met if the
nodes $j - 1$, $j$ and $j + 1$ have no extremum for component $V_i$ (case 1).
In this case, it is sufficient to take the same function in the second-order
TVD domain for each point $j - 1$, $j$, $j + 1$. If an extremum exists at one
or both neighbors of node $j$ (case 2 or 3), the condition is more
restrictive. Since we have $\varphi'^L = 0$ and/or $\varphi'^R = 0$,
second-order accuracy is ensured only if

$\varphi'(1) = 0$.                                                      (12.9)




If $V_{i,x} = 0$ at point $j$, the condition (12.7) is obtained if $j - 1$
and $j + 1$ are not associated with an extremum (case 4). But this condition
is no longer met if there exists at least one extremum at one of the
neighbors of $j$. For these cases (cases 5 and 6), the first-order error term
does not cancel. Case 5 corresponds to local phenomena of wavelength
$2\Delta x$, and case 6 to phenomena of wavelength $2\Delta x$ or
$3\Delta x$. Therefore, when the physical variations have wavelengths greater
than $3\Delta x$, the scheme is second-order in space if the expression of
$\varphi$ is well-defined. When the wavelengths are smaller than or equal to
$3\Delta x$ (cases 5 and 6), the first-order error term is still present. In
this case, the scheme has strong dissipative properties that can eliminate
the numerical instabilities.

For $M > 1$, the error terms have the same expression whatever the
splitting; but for $M < 1$, their expression, $[A] = [A]^{AUSM}$ for
Liou-Steffen and $[A] = [A]^{VL}$ for van Leer, depends on the splitting
chosen. For case 5,
where  gf ,R  =  —1,  and  for  case  6  where  pf  =  —  1  and  gf  =  0,  the  error  terms 


are  written: 


[A]ausm  =  [A]fSM  +  {A]vaL , 

=  \a^l + (AirL ,  with 


[A)fSM  =  [A] 


=  [A]r+[A]va 

VL 
c 


and  [A^L  given  by 


w. = 


All  Ai2  Ai3 

uAn  uAi2  +  pAi i  uAi3 

\_HAn  HA12  4-  puAn  HA13  +  pCpAn 


where 


An  =  |M,  A12  —  |(Mi5(5  +  bSie), 


A13  =  jfMd6  for  [A] 

An  =  f  (a2di5  +  b2di6), 

A12  =  5  +  bSis), 


AUSM 

c  • 


Au  =  -^bdS  for  [A] 

„2  _  1+M2  h  _  1 +M 

u  2  > u  2  5 


VL 


d= 

a  2  J 


S  =  6l5  +  s-f, 


0  0 


£6  |t(Ai 2  +  f^6) 
0  0 
$\delta_{l5}$ and $\delta_{l6}$ are Kronecker symbols, with $l = 5$ (case 5)
or $l = 6$ (case 6).

The error term induced by AUSM splitting is always smaller. The differences
become greater when $M \to 0$. For case 5, for example, when $M \to 0$, many
components of $[A]$ cancel with AUSM splitting (Fig. 13.15). From this study,
we deduce

Condition 1: the AUSM splitting (12.4) is chosen when the scheme degenerates
into a first-order scheme (cases 5 and 6).


12.3  Second-order  Error  Terms  in  Space 


In this section, we see that the first-order terms of (ES) cancel if the
limiters satisfy given properties at some specific points only. Only the
second-order terms then remain, and we specify explicitly which limiters
should be taken so that these terms also cancel. The system (12.2) has the
following expression:

$W_t + F_x + \Delta x^2 (B V_{xxx} + C V_{xx} + D V_{xx} + E V_x) = O(\Delta x^3)$   (12.10)

where

$B = B(\chi_1, V)$,
$C = C(\varphi''_1, \varphi''^{L}_2, \varphi''^{R}_2, V, V_x)$ if $V_{i,x} \neq 0$ at node $j$ (cases 1-3),
$C = 0$ if $V_{i,x} = 0$ at node $j$ (case 4),
$D = D(\chi_2, V, V_x)$,
$E = E(V, V_x)$

with

- if $V_{i,x} \neq 0$ at $j$, $\chi_1 = 1 - 3\varphi'_2$ and
  $\chi_2 = 1 - \varphi'_1 - \varphi'_2$,

- if $V_{i,x} = 0$ at $j$, $\chi_1 = 2 + \varphi_1 - \varphi_2 + 2\varphi'_1 - 4\varphi'_2$
  and $\chi_2 = 2 + \varphi_1 - \varphi_2$.

The matrices $B$, $C$, $D$, $E$ are provided in annex A.

By homogeneity, the cancellation of $C$ for cases 1-3 gives the following
condition on $\varphi''$: $\varphi''_1(1) = \varphi''^{L}_2(1) = \varphi''^{R}_2(1)$.
The terms $E_i V_x$ ($i = 1$ (continuity eq.), 2 (momentum eq.), 3 (energy
eq.)) come from fluxes that contain products of at least three primary
quantities (for example $\rho u^2$, ...). For each equation, the error terms
are expressed in annex A. With AUSM and VL splittings, the error terms are
the same, except for $E_i V_x$ and the dissipation term in the energy
equation:

(E1Vx)AUSM  =  [-3pUx  _  cMpx  +  2 pcM(LogT)x] , 

(E1Vx)VL  =  0 


32 


\AUSM 


(Ea\ 4) 

16/9^1x3  +  4 cMpxux(LogT)x  -  4puxux(LogT)x+ 
c2M2px(LogT)2x  +  2pc2  M2  (LogT)l 


X 

32 


(■®2^/a:)  —  i2^Px^2x^'x)i 

(E3Vx)ausm  = 

20 cMpxuxux  +  1  6c2(C2  -  ^)PzUz(£o5T)a;+ 

2 pcMuxux(LogT)x  -  c3M(3C2  -  ^ )px{LogT)2x -  , 

pc2(3C2  +  Mi )ux(LogT)2x  +  %pc3M3(LogT)l 


(E3Vx')  —  g  [cilf pxiixux  "h  ^>CppxiixTx  4“  puxuxux\  , 

(d3vxx)vl  =  (d3vxx)ausm  +  f  [Cpdif  (p„r.).] , 

where  C2  =  -  1.  In order to avoid the appearance of numerical
oscillations corresponding to case 5 or 6, it is better to eliminate the
dispersive error term $B V_{xxx}$. Although these oscillations are damped by
the scheme, as we have seen in the previous paragraph, it is harmful to lower
the accuracy of the scheme artificially when this is not necessary.
Therefore, for cases 1 and 4, we let $\chi_1 = 0$. Applying conditions
(12.7-12.9), we have

$\varphi'_1(1) = \varphi'_2(1) = \frac{1}{3}$ if $V_{i,x} \neq 0$         (12.11)

at nodes $j - 1$, $j$ and $j + 1$ (case 1) and

$\varphi'_a(3) = 0$ if $V_{i,x} = 0$                                    (12.12)

at node $j$ (case 4).

For cases 2 and 3, as the constraints (12.8-12.9) are already imposed, we
have $\chi_1 = 1$. Therefore, $B V_{xxx}$ does not cancel for these cases.
But in fact, it is possible to eliminate the dispersive term if we use a
multi-time stepping scheme and if we apply different expressions of
$\varphi$ at every time step (not presented here). From conditions
(12.7-12.9) and (12.11-12.12), it is possible to define an adequate limiter
in the form of a triad of limiters, each adapted to the local variation of
the physical quantities. If at node $j$ we have:

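- case 1, we take ([53])

  $\varphi_1 = \varphi_2 = \varphi = \begin{cases} (r + 2)/3 & \text{if } 2/5 \le r \le 4 \\ 2 & \text{if } r > 4 \\ 2r & \text{if } 0 < r < 2/5 \\ 0 & \text{if } r \le 0 \end{cases}$   (12.13)

- cases 2 and 3, we choose

  $\varphi_1 = \varphi_2 = \varphi = \begin{cases} 1 & \text{if } r \ge 1/2 \\ 2r & \text{if } 0 < r < 1/2 \\ 0 & \text{if } r \le 0 \end{cases}$   (12.14)

- case 4, we define

  $\varphi_1 = \varphi_2 = \varphi = \varphi_{superbee}$                (12.15)

- cases 5 and 6, $\varphi$ has to verify only the constraint (12.6). It is
  easy to see that the triad (12.13-12.15) verifies this condition.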
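
The triad can be written down directly as three scalar functions. The following Python sketch is an illustration under stated assumptions: the function names are hypothetical, the superbee limiter is taken in its usual form max(0, min(2r, 1), min(r, 2)), and superbee is also used for cases 5 and 6, which the text leaves free apart from constraint (12.6).

```python
def phi_case1(r):
    """Limiter (12.13): (r+2)/3 between 2/5 and 4, bounded by 2r and by 2."""
    if r <= 0.0:
        return 0.0
    if r < 0.4:
        return 2.0 * r
    if r <= 4.0:
        return (r + 2.0) / 3.0
    return 2.0

def phi_case23(r):
    """Limiter (12.14): 2r up to r = 1/2, then constant and equal to 1."""
    if r <= 0.0:
        return 0.0
    return min(2.0 * r, 1.0)

def superbee(r):
    """Usual superbee limiter, used for (12.15)."""
    return max(0.0, min(2.0 * r, 1.0), min(r, 2.0))

def phi_triad(r, case):
    """Select the limiter from the case number (1..6) at the node."""
    if case == 1:
        return phi_case1(r)
    if case in (2, 3):
        return phi_case23(r)
    # Case 4 uses superbee; for cases 5 and 6 only phi(r < 0) = 0 is
    # required, and superbee is an assumed choice that satisfies it.
    return superbee(r)
```

With these definitions one can check numerically the properties used in the analysis: `phi_case1` has slope 1/3 at r = 1, `phi_case23` has zero slope at r = 1, and `superbee` equals 2 with zero slope at r = 3.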


From (12.13-12.15), $\chi_2 = 1/3$ for case 1, $\chi_2 = 1$ for cases 2 and
3 and $\chi_2 = 0$ for case 4. If all the components $V_i$ have an isolated
extremum at $j$ (case 4) at the same time, the second-order dissipative error
terms vanish. For this case, $E_i V_x$ vanishes too. The scheme is then
third-order accurate in space. For the second-order error terms, the main
difference between AUSM and VL splittings is in the expression of the terms
$E_i V_x$. As long as the temperature gradients are weak, these terms can be
neglected. But for reactive flows, their values become non-negligible and the
choice of the splitting becomes important. For this kind of flow, it is
better to take VL splitting because, with it, the expressions for $E_i V_x$
are much simpler and remain the same as those associated with supersonic
flow. Their values are smaller too. Therefore

Condition 2: when the scheme remains second-order or third-order (cases
1-4), it is recommended to use VL splitting (12.3).

Although simpler with this splitting, the $E_i V_x$ terms are not
automatically negligible, in particular when the temperature variations
become large. So it is to our advantage to eliminate these terms, which
appear as additional transport terms in the conservation equations:

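$\rho_t + (\rho u)_x + \Delta x^2 (B_1 V_{xxx} + D_1 V_{xx}) = O(\Delta x^3)$

$(\rho u)_t + (\rho u^2 + p)_x + (\delta\rho\,\delta u)_x + \Delta x^2 (B_2 V_{xxx} + D_2 V_{xx}) = O(\Delta x^3)$

$(\rho E)_t + (\rho u H)_x + \delta(\rho H)_x + \Delta x^2 (B_3 V_{xxx} + D_3 V_{xx}) = O(\Delta x^3)$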




where

$\delta(\rho H) = 3 C_p\,\delta\rho\,\delta T + u\,\delta\rho\,\delta u + \rho\,\delta u\,\delta u$

and $\delta V_i$ represents the variation of $V_i$ over the mesh size
$\Delta x$. Formally, the residual error terms can be corrected by adding the
opposite value to the expression for the fluxes. For example, by defining the
flux at the interfaces $j + 1/2$ and $j - 1/2$ in the following form:

^+1/2)  +  F+(V}L+  1/2)  -  SQj Vj+1/2, 

F-{V*  1/2)  +  F+(Vf_1/2)  -  SQjVj _1/2, 

cM  =  C—^{F+  +  f£),  cL*  =  C{UL <R), 

0 

6p5u 

12  , 

3C p  5  p6T-\-uS  p8u-\-  p5u6u 

L  8  -I  j 

where $\delta V_j = V_{j+1/2} - V_{j-1/2}$, the terms $E_i V_x$ disappear.
This correction is activated only if the scheme remains second- or
third-order in space.


tjFS 
*3+ 1/2 
pFS 

*j- 1/2 


^■+1/2  - 
SQj  = 


12.4  Multi-time  stepping  algorithm 

The analysis of the previous sections was based on the hypothesis of a single
time step. When we use a multi-time stepping scheme, two questions come to
mind:

- What is the effect of a multi-stepping scheme on the spatial error terms
  and on the CFL criterion?

- What is the minimum number of time steps needed to achieve sufficient
  accuracy?

For the particular second-order scheme in time:

$\bar V_j = V_j^n - \sigma\,(\mathcal{F}_{j+1/2} - \mathcal{F}_{j-1/2})$

$V_j^{n+1} = \frac{1}{2}\,[(V_j^n + \bar V_j) - \sigma\,(\bar{\mathcal{F}}_{j+1/2} - \bar{\mathcal{F}}_{j-1/2})]$   (12.16)


we show that if we choose the same limiters
($\varphi = \varphi_1 = \varphi_2$ for the predictor stage and
$\bar\varphi = \bar\varphi_1 = \bar\varphi_2$ for the corrector one) for each
case considered, the spatial error terms are the same as those generated by a
single-time stepping scheme. However, a more restrictive condition on the CFL
number is introduced in order that the scheme (12.16) verifies (12.6) and
remains second-order in space at a node $j$ where, for one or several
components, $V_{i,x} = 0$ and $V_{i,xx} \neq 0$ (equation 12.7):

$\varphi(-a) = \varphi(-1/a) = 0$ and $\varphi(b) = \varphi(c) = 2$, with

$a = 1 + 4\sigma F_{i,xx}/(V_{i,xx} - 2\sigma F_{i,xx})$,
$1/a = 1 - 4\sigma F_{i,xx}/(V_{i,xx} + 2\sigma F_{i,xx})$,
$b = 1 + 2V_{i,xx}/(V_{i,xx} - 2\sigma F_{i,xx})$,
$c = 1 + 2V_{i,xx}/(V_{i,xx} + 2\sigma F_{i,xx})$, then

$\sigma\,\left| F_{i,xx}/V_{i,xx} \right| \le \frac{1}{2}$.

This inequality, associated with the classical condition, gives a new
condition on the local time step $\Delta t_j$, in which the values of the
second derivatives act:

$\Delta t_j \le \min\left( \frac{2\,\Delta x}{3 \max_i |\lambda_i|},\ \frac{\Delta x}{2 \max_i |F_{i,xx}/V_{i,xx}|} \right)$,

where the $\lambda_i$ are the eigenvalues of the Jacobian $A$ of $F$.
$\Delta t$ has to verify $\Delta t \le \min_j \Delta t_j$.

For example, with the scalar non-linear equation (inviscid Burgers
equation), the condition on $\Delta t_j$ associated with an extremum
($V_x = 0$ and $V_{xx} \neq 0$) is the most restrictive, since
$|\lambda_i| = u$ and $|F_{i,xx}/V_{i,xx}| = u$, and therefore

$\Delta t_j \le \frac{\Delta x}{2u}$.

This result has not been generalized to a scheme with three time steps; but
it would seem that the spatial error terms would still be the same, probably
with a new condition on the CFL number. Knowing
$W_{ttt} = -(F'_W (F'_W W_x))$ [54], we can write the condition (ES):


Wt  +  Fx 


-(A3VX)X 


+  2nd  order  spatial  error  terms  =  0(Ax3,At3). 

The  time  error  terms  are  similar  and  of  the  same  order  as  the  spatial 
error  terms  when  the  CFL  number  is  close  to  unity  and,  what  is  essential, 
there  is  theoretically  no  way  of  canceling  or  even  controlling  them.  That  is 
to  say,  if  we  keep  this  accuracy  in  time,  all  the  effort  devoted  to  the  spatial 
discretization  in  order  to  control  the  error  terms  will  become  useless.  Since  all 
these  terms  cannot  be  eliminated  easily  (they  are  still  more  complicated  than 
the  spatial  error  terms),  it  is  recommended  that  a  higher  order  scheme  in  time 
(at  least  third-order)  be  applied  in  order  to  remove  them  automatically.  The 
second-order  and,  even  more,  the  first-order  error  terms  are  then  controlled 
by  the  spatial  discretization.  For  example,  we  can  use  the  third-order  scheme 
in  time  defined  in  [55]. 
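
For reference, the two-stage scheme (12.16) and one possible third-order alternative can be sketched for a generic semi-discrete operator. The third-order variant below is the classical Shu-Osher TVD Runge-Kutta scheme; it is used here as an assumption, since the scheme of [55] is not reproduced in the text. The callable `rhs` is hypothetical and stands for the spatial operator, that is, minus the flux difference divided by the mesh size.

```python
def rk2_step(v, dt, rhs):
    """Two-stage predictor-corrector step in the spirit of (12.16)."""
    v_bar = v + dt * rhs(v)                      # predictor
    return 0.5 * (v + v_bar + dt * rhs(v_bar))   # corrector

def rk3_tvd_step(v, dt, rhs):
    """Third-order TVD Runge-Kutta step (Shu-Osher form)."""
    v1 = v + dt * rhs(v)
    v2 = 0.75 * v + 0.25 * (v1 + dt * rhs(v1))
    return v / 3.0 + 2.0 / 3.0 * (v2 + dt * rhs(v2))
```

Both functions accept plain floats or NumPy arrays. Applied to the linear test equation v' = -v with step h, the first reproduces the Taylor expansion of exp(-h) up to the h^2 term and the second up to the h^3 term, which is the gain in temporal order recommended above.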


12.5  Applications 

In  1-D,  the  example  proposed  by  Shu  and  Osher  [51]  is  interesting  because  it 
uses  the  Euler  equations  to  simulate  the  interaction  between  a  moving  Mach 
3  shock  and  a  turbulent  flow  represented  by  sine  waves  in  density.  The  initial 
conditions  are  described  as: 

- $\rho = 3.857143$, $u = 2.629369$, $p = 10.33333$ if $x < -4$,

- $\rho = 1 + 0.2 \sin 5x$, $u = 0$, $p = 1$ if $x \ge -4$.
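
The left state is the post-shock state of a Mach 3 shock moving into gas at rest with unit density and pressure, which can be checked with the Rankine-Hugoniot relations. The sketch below does this and builds the initial data (density, velocity, pressure); the value $\gamma = 1.4$ and the function names are assumptions, as the text does not state them.

```python
import math

GAMMA = 1.4  # assumed specific heat ratio

def post_shock_state(mach, rho1=1.0, p1=1.0, gamma=GAMMA):
    """State behind a shock moving at Mach number `mach` into gas at rest."""
    m2 = mach * mach
    rho2 = rho1 * (gamma + 1.0) * m2 / ((gamma - 1.0) * m2 + 2.0)
    p2 = p1 * (1.0 + 2.0 * gamma / (gamma + 1.0) * (m2 - 1.0))
    shock_speed = mach * math.sqrt(gamma * p1 / rho1)
    u2 = shock_speed * (1.0 - rho1 / rho2)   # from mass conservation
    return rho2, u2, p2

def shu_osher_initial(x):
    """Initial (density, velocity, pressure) of the shock/sine-wave problem."""
    if x < -4.0:
        return (3.857143, 2.629369, 10.33333)
    return (1.0 + 0.2 * math.sin(5.0 * x), 0.0, 1.0)
```

Calling `post_shock_state(3.0)` recovers the three tabulated left-state values to the number of digits quoted.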
As in [51], the CFL number is equal to 0.5 and the final time is t = 1.8.
Since the exact solution for this problem is unknown, the numerical solution
with 1600 cells, drawn as a solid line, is taken as the exact solution.

Figs. 13.16a, b and c show the density field computed with 400 cells and the
limiters minmod, superbee, and $\varphi = (r + 2)/3$, applied separately. The
limiters minmod and superbee give mediocre solutions. If AUSM-VL splittings
(12.3-12.4) with the selected triad (12.13-12.15) are applied (Fig. 13.16d),
the solution is comparable with that of the third-order ENO scheme. In
particular, the high frequencies are well represented and the compression
waves and the shock are well captured. If the nodes where the scheme
degenerates to a first-order scheme in space (outside the shock wave) are
plotted (Fig. 13.17), we see that it degenerates not in the regions of strong
fluctuations but rather in the relatively quiet regions; that is to say,
essentially to eliminate numerical micro-oscillations.

The second test case is a 2-D axisymmetric supersonic mixing layer. The
inlet conditions are:

- central jet: M = 1.74, p = 8 × 10^3 Pa, T = 200 K,

- peripheral jet: M = 2, p = 8 × 10^3 Pa, T = 580 K.

We solve the Euler equations using a splitting method. The 2-D finite
difference operator is split into a product of simpler operators,
$U^{n+2} = (L_r L_z L_z L_r)\,U^n$, where $L_r$ and $L_z$ are hyperbolic 1-D
difference operators in the directions r and z. The computations have been
done on a grid of 351 nodes in the z direction and 93 nodes in the r
direction. The CFL number is equal to 0.5.
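
The symmetric sequence of 1-D operators can be illustrated on a much simpler model. In the sketch below, which is only an assumed stand-in, each 1-D operator is a first-order upwind step for linear advection on a periodic grid; in the actual computation $L_r$ and $L_z$ are the 1-D schemes of this section applied to the Euler equations, and the axisymmetric geometry is ignored here.

```python
import numpy as np

def upwind_step(u, cfl, axis):
    """First-order upwind step for u_t + a u_x = 0 (a > 0), periodic grid.

    `cfl` is a*dt/dx along the chosen axis.
    """
    return u - cfl * (u - np.roll(u, 1, axis=axis))

def split_double_step(u, cfl_r, cfl_z):
    """Advance two time levels with the symmetric sequence Lr Lz Lz Lr."""
    Lr = lambda w: upwind_step(w, cfl_r, axis=0)
    Lz = lambda w: upwind_step(w, cfl_z, axis=1)
    # Operators act from right to left: Lr first, then Lz twice, then Lr.
    return Lr(Lz(Lz(Lr(u))))
```

Reversing the order of the sweeps on the second time level is what makes the sequence symmetric; with these model operators the sum of `u` is conserved and no new extrema are created for CFL numbers up to 1.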

Fig. 13.18a shows an instantaneous view of the temperature field, with the
scheme using AUSM splitting and the limiter $\varphi = (r + 2)/3$, and Fig.
13.18b shows the same view, with the same time-stepping scheme but using
conditions 1 and 2 for AUSM-VL splittings and the triad of limiters. The
transitional zone (A) shows a greater sensitivity of the scheme proposed here
to the physical instabilities. We can also see the very weak diffusion of the
scheme in the shear layer. In the growth of the large eddies (B), the mixing
in the core of the eddies is more detailed with the method presented here.

12.6  Summary 

This section shows that it is possible to improve the accuracy of the
TVD-MUSCL approach if:

- the accuracy in time is greater than the accuracy in space,

- the non-linear functions $\varphi$ are expressed as a triad (12.13-12.15)
  taking into account the local variations of each quantity,

- AUSM splitting (12.4) is used when the scheme degenerates into a
  first-order one, and VL splitting (12.3) is applied when the scheme remains
  second- or third-order.

In addition to the well-known basic advantages of the algorithm proposed
herein, the good accuracy of the numerical solution opens new perspectives
for TVD-MUSCL schemes. In particular, Large Eddy Simulations (LES) and Direct
Numerical Simulations, already performed with this approach in [56], [57],
[58], [60] for example, now seem to come within the field of application of
these schemes. Algorithmically, the correction proposed herein is easy to
implement and the additional computing time is very small. The new coding for
the $\varphi$-triad is written in annex B. New improvements are possible by
using different expressions of $\varphi$ at each time step of the time
integration. Simulations of freely decaying isotropic turbulence, to evaluate
the reliability of this scheme in LES, and a study of first-order and
second-order scheme stability are in progress.

13  Conclusion 

We have presented some high order numerical schemes, mainly of the ENO
type, that are able to produce very good numerical results. We have also
considered the efficiency issue, in terms of CPU cost. Realistic examples
have been considered.
Table 13.1. log10 of L∞ for (6.1)


function   ε        μ        error (L∞)     error (L¹)

h          10^-2    13.67    2.21 10^-2     1.11 10^-3
           10^-3     3.31    8.67 10^-4     1.04 10^-4
           10^-4     1.17    7.05 10^-5     3.37 10^-6

h          10^-2     2.65    2.15 10^-2     6.48 10^-4
           10^-3     1.50    9.17 10^-4     6.61 10^-5
           10^-4     1.02    6.97 10^-5     5.27 10^-7

Table 13.2. Results for the agglomeration procedure on the domain with holes.

Table 13.3. Nodal errors for (6.2)-(6.3)
Fig. 13.1. Elements of the triangulation and the dual mesh


Fig. 13.2. Covering of the rectangle [x0, x1] × [y0, y1] by triangular control volumes


Fig. 13.3. Stencils for third order reconstruction

Fig. 13.6. Density for M = 5.5 and θ = 30°, (A): second order scheme, (B): third
order scheme

Fig. 13.7. Different types of stencils for quadratic reconstruction.


Fig. 13.9. Density for M = 5.5 and θ = 45°, (A): first order, (B): second order,
(C): third order. Zoom around the triple point


Fig. 13.10. The stencil for the reconstruction.


Fig. 13.12. 40 isolines of the Mach number

Fig. 13.15. Evolution of the first-order error terms.


A  Details  of  (36) 


The  detailed  expression  of  (37)  for  each  equation  is  written: 


A.1 Continuity equation


Pt  H“  (pu)x  T  { “g- (cAf pxxx  T  P^xxx)  T  ^4'{Px'U‘x)x 
+  <'L°3p*  [-(3 pux  +  cMpx )  +  2pcM(LogT)x]Iausm} 


=  0(Ax 3) 
Fig. 13.16. Moving shock in a sinusoidal density field: (a) density, AUSM and
minmod; (b) density, AUSM and superbee; (c) density, AUSM and φ = (r+2)/3;
(d) density, AUSM-VL and φ-triad.


Fig. 13.17. Location of degeneration into first-order.

Fig. 13.18. Instantaneous temperature field in a supersonic jet.


A.2  Momentum  equation 

(. pu)t  +  (pu2  +  p)x 

+Ax2{^[c2(M2  +  ^ )pxxx  +  2  pcMuxxx  +  pRTxxx ] 
4”  ^4  [2cil^(pa;Ua; “I-  p(uxUx^x  ”1"  Ri.Px'Rx'jz] 

+  ( LTL  +  if )PxUxUx 

+  (L°l 'P*  [AcMpxux  -  Apuxux  +  c?M2px(LogT)x 


+2pc2M2(LogT)2x]Iausm} 
=  0(Ax3) 
A.3 Energy equation

(pE)t  +  (puH)x 


+Ax2{f[c3M(*£  +  C2)pxxx 
+pc2(^- - 1-  C2)lLxxx  +  ^-Txxx] 

+f[c2(3Jf  +  C2)(pxux)x  +  3-^-{ux  UX)X  +  <&£(UXTX)X] 


+Y[CPcM(pxTx)x\Ivl 


+  (S/a«jm.  +  El)cM  pxuxux  +  (|  CppxuxTx  +  \puxuxux)Ivi 
+  <'L°iP*  [16c2  (C2  -  ^-)pxux  +  2pcMuxux 


-c3M(3C2  -  )Px(LogT)x  -  pc2(3C2  +  )ux(LogT)x 


+ \pc3  M3  (LogT)l\Iauam  } 


=  0(Ax3) 

where $I_{ausm} = 1$ and $I_{vl} = 0$ for AUSM splitting, and $I_{ausm} = 0$
and $I_{vl} = 1$ for VL splitting.

B  A piece of code

For cases 1-4, the right- and left-values of U (MUSCL approach) are
calculated as follows. The lines written in capital letters already exist in
codes with the classical limiters. The ten lines written in small letters
correspond to the addition of the triad of limiters. The coding is very
simple and the additional computing time for a complete code is very small.

c     The right and left values are calculated as follows:
c
c       UR(L+1/2) = U(L+1) - PHI(1/R(L+1)) * ( U(L+1)-U(L) )
c       UL(L+1/2) = U(L)   + PHI(R(L))     * ( U(L+1)-U(L) )
c
c       R(L) = ( U(L)-U(L-1) ) / ( U(L+1)-U(L) )
c
c       if R <= 0:  PHI(R) = 0
c       if R >  0:  PHI(R) = ( (1-ETA) * MIN( R, (3-ETA)/(1-ETA) )
c                            + (1+ETA) * MIN( 1, (3-ETA)*R/(1-ETA) ) )/4
c
c     DELSIGN(L) = sign of the variation U(L+1)-U(L)
c                  (+1 if U(L+1)-U(L) = 0)
c
c     if is1234=1 and is23=1 we have case 1
c     if is1234=0 and is23=1 we have cases 2 and 3
c     if is1234=0 and is23=0 we have case 4
c
c     case 1:
c       PHI(R) = (R+2)/3   if 2/5 < R < 4
c                2         if R > 4
c                2*R       if 0 < R < 2/5
c     cases 2 and 3:
c       PHI(R) = 1         if R > 1/2
c                2*R       if 0 < R < 1/2
c     case 4:
c       PHI(R) = SUPERBEE(R)
c
c     ************* Phi-triad *************
      DO L = 1, LMAX-1
         DELU(L)    = U(L+1) - U(L)
         DELSIGN(L) = SIGN(1., DELU(L))
      ENDDO
      DO L = 3, LMAX-2
         DEL = DELU(L) - DELU(L-1)
c***     automatic choice of the limiter ***
         etasbee = dim(sign(1., del), 0.)
     &           - dim(0., sign(1., del))
         is1 = delsign(L-2)
         is2 = delsign(L-1)
         is3 = delsign(L  )
         is4 = delsign(L+1)
         is1234 = iabs(is1+is2+is3+is4) / 4
         eta13  = is1234 / 3. + (1-is1234)
         is23   = iabs(is2+is3) / 2
         eta(L) = eta13 * is23 + etasbee * (1-is23)
         omega  = is23 * (3. - eta(L))
     &          / (1. - eta(L)) + 2.* (1-is23)
c        ***********
         A = DELU(L  ) * DELSIGN(L-1)
         B = DELU(L-1) * DELSIGN(L  )
         ABMIN    = MIN(A, OMEGA*B)
         DELUP(L) = MAX(0., ABMIN) * DELSIGN(L  )
         ABMIN    = MIN(B, OMEGA*A)
         DELUM(L) = MAX(0., ABMIN) * DELSIGN(L-1)
      ENDDO
c*    U-right and U-left at the interface L+1/2 *
      DO L = 3, LMAX-3
         UR(L) = U(L+1)
     &         - 0.25*( (1.-ETA(L+1))*DELUP(L+1)
     &                + (1.+ETA(L+1))*DELUM(L+1) )
         UL(L) = U(L  )
     &         + 0.25*( (1.-ETA(L))*DELUM(L)
     &                + (1.+ETA(L))*DELUP(L) )
      ENDDO
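
As a quick cross-check of the general limiter formula quoted in the comments of the listing, the following Python sketch evaluates PHI(R) for a given ETA. It is only an illustration under our own assumptions: the function name `phi` is hypothetical, the formula is the one from the comment block (not the DELUP/DELUM form used in the loops), and ETA < 1 is required because of the division by (1-ETA).

```python
def phi(r, eta):
    """Limiter PHI(R) from the comment block; assumes eta < 1."""
    if r <= 0.0:
        return 0.0  # PHI(R) = 0 at extrema and sign changes
    omega = (3.0 - eta) / (1.0 - eta)
    return ((1.0 - eta) * min(r, omega)
            + (1.0 + eta) * min(1.0, omega * r)) / 4.0
```

With this formula PHI(1) = 1/2 for every admissible ETA, since the two MIN terms then reduce to 1 and the weights (1-ETA) and (1+ETA) add up to 2.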


Discontinuous  Galerkin  Methods  for 
Convection-Dominated  Problems 


Bernardo  Cockburn1 

School  of  Mathematics,  University  of  Minnesota, 

206  Church  Street  S.E.,  Minneapolis,  MN  55455,  USA 


Abstract. We present and analyze the Runge-Kutta Discontinuous Galerkin method
for numerically solving nonlinear hyperbolic systems. The basic method is then
extended to convection-dominated problems, yielding the Local Discontinuous Galerkin
method. These methods are particularly attractive since they achieve formal
high-order accuracy, nonlinear stability, and high parallelizability while maintaining the
ability to handle complicated geometries and capture the discontinuities or strong
gradients of the exact solution without producing spurious oscillations. The
discussed methods are readily applied to the Euler equations of gas dynamics, the
shallow water equations, the equations of magneto-hydrodynamics, the
compressible Navier-Stokes equations with high Reynolds numbers, and the equations of
the hydrodynamic model for semiconductor device simulation. As a final example,
consideration is given to the application of the discontinuous Galerkin method to
the Hamilton-Jacobi equations.

1 Introduction

1.1  The  purpose  of  these  notes 

In these notes, we study the Runge-Kutta Discontinuous Galerkin method for
numerically solving nonlinear hyperbolic systems and its extension to
convection-dominated problems, the so-called Local Discontinuous Galerkin method. Examples
of problems to which these methods can be applied are the Euler equations of gas
dynamics, the shallow water equations, the equations of magneto-hydrodynamics, the
compressible Navier-Stokes equations with high Reynolds numbers, and the
equations of the hydrodynamic model for semiconductor device simulation; the application
to Hamilton-Jacobi equations is another important example. The main features
that make the methods under consideration attractive are their formal high-order
accuracy, their nonlinear stability, their high parallelizability, their ability to handle
complicated geometries, and their ability to capture the discontinuities or strong
gradients of the exact solution without producing spurious oscillations. The
purpose of these notes is to provide a short introduction to the devising and analysis
of these discontinuous Galerkin methods. Most of the material of these notes has
been presented in [17].

Acknowledgements. The author would like to thank T.J. Barth for the
invitation to give a series of lectures in the NATO special course on ‘Higher Order
Discretization Methods in Computational Fluid Dynamics’ [...] is contained in these
notes. He would also like to thank F. Bassi and S. Rebay, and
I. Lomtev and G. Karniadakis for kindly supplying several of their figures, and C.
Hu and Chi-Wang Shu for providing their material on Hamilton-Jacobi equations.
Thanks are also due to Rosario Grau for fruitful discussions concerning the
numerical experiments of Chapter 2, to J.X. Yang for a careful proofreading of the appendix
of Chapter 6, and to A. Zhou for bringing to the author’s attention several of his
papers concerning the discontinuous Galerkin method.

1.2  A  historical  overview 

The original Discontinuous Galerkin method. The original discontinuous
Galerkin (DG) finite element method was introduced by Reed and Hill [76] for
solving the neutron transport equation

$$\sigma u + \mathrm{div}(a\,u) = f,$$

where $\sigma$ is a real number and $a$ a constant vector. A remarkable advantage of this
method is that, because of the linear nature of the equation, the approximate
solution can be computed element by element when the elements are suitably ordered
according to the characteristic direction.

LeSaint and Raviart [58] made the first analysis of this method and proved a
rate of convergence of $(\Delta x)^k$ for general triangulations and of $(\Delta x)^{k+1}$ for Cartesian
grids. Later, Johnson and Pitkäranta [52] proved a rate of convergence of $(\Delta x)^{k+1/2}$
for general triangulations and Peterson [75] numerically confirmed this rate to be
optimal. Richter [77] obtained the optimal rate of convergence of $(\Delta x)^{k+1}$ for some
structured two-dimensional non-Cartesian grids. In all the above papers, the exact
solution is assumed to be very smooth. The case in which the solution admits
discontinuities was treated by Lin and Zhou [60], who proved the convergence of the
method. The issue of the interrelation between the mesh and the order of
convergence of the method was explored by Zhou and Lin [93], case $k = 1$, and later by
Lin, Yan, and Zhou [59], case $k = 0$, and optimal error estimates were proven under
suitable assumptions on the mesh. Recently, several new results have been obtained.
Thus, Falk and Richter [39] obtained a rate of convergence of $(\Delta x)^{k+1/2}$ for general
triangulations for Friedrichs systems; Houston, Schwab and Süli [42] analyzed the
hp version of the discontinuous Galerkin method and showed its exponential
convergence when the solution is piecewise analytic; and, finally, Cockburn, Luskin,
Shu, and Süli [22] showed how to exploit the translation invariance of a grid to
double the order of convergence of the method by a simple, local postprocessing of
the approximate solution.


Nonlinear hyperbolic systems: The RKDG method. The success of this
method for linear equations prompted several authors to try to extend the method
to nonlinear hyperbolic conservation laws

$$u_t + \sum_{i=1}^{d} (f_i(u))_{x_i} = 0,$$

equipped with suitable initial or initial-boundary conditions. However, the
introduction of the nonlinearity prevents the element-by-element computation of the
solution. The scheme defines a nonlinear system of equations that must be solved
all at once and this renders it computationally very inefficient for hyperbolic
problems.

•  The  one-dimensional  scalar  conservation  law. 

To avoid this difficulty, Chavent and Salzano [13] constructed an explicit version
of the DG method for the one-dimensional scalar conservation law. To do that, they
discretized in space by using the DG method with piecewise linear elements and
then discretized in time by using the simple Euler forward method. Although the
resulting scheme is explicit, the classical von Neumann analysis shows that it is
unconditionally unstable when the ratio $\Delta t/\Delta x$ is held constant; it is stable if $\Delta t/\Delta x$ is of
order $\sqrt{\Delta x}$, which is a very restrictive condition for hyperbolic problems.
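
The instability for a fixed ratio $\Delta t/\Delta x$ can be checked numerically with a small von Neumann computation. The sketch below is ours and is not taken from [13]; it assumes the linear flux $f(u) = c\,u$ with $c > 0$, the upwind numerical flux, a uniform mesh, and the representation $u_h = u_j^0 + u_j^1\,\varphi_1$ on each cell with $\varphi_1(x) = 2(x - x_j)/\Delta x$. Under these assumptions a Fourier mode of wave number $\theta$ satisfies $\frac{d}{dt}(u^0, u^1)^T = (c/\Delta x)\,M(\theta)\,(u^0, u^1)^T$ with $M = \begin{pmatrix} -(1-z) & -(1-z) \\ 3(1-z) & -3(1+z) \end{pmatrix}$, $z = e^{-i\theta}$, and the forward Euler amplification matrix is $I + \nu M$ with $\nu = c\,\Delta t/\Delta x$. All function names are illustrative.

```python
import cmath
import math

def dg_p1_symbol_eigs(theta):
    """Eigenvalues of the 2x2 symbol M(theta) of the P1 upwind DG operator."""
    z = cmath.exp(-1j * theta)
    trace = -(1 - z) - 3 * (1 + z)
    det = 6 * (1 - z)  # determinant of M(theta)
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

def euler_spectral_radius(nu, n_theta=720):
    """Largest |1 + nu*lambda| over sampled wave numbers in [0, 2*pi]."""
    rho = 0.0
    for m in range(n_theta + 1):
        theta = 2 * math.pi * m / n_theta
        for lam in dg_p1_symbol_eigs(theta):
            rho = max(rho, abs(1 + nu * lam))
    return rho
```

In this sketch the value returned by `euler_spectral_radius` exceeds 1 for the fixed ratios we tried (for instance $\nu = 0.5$ and $\nu = 0.1$), which is consistent with the instability described above; the excess shrinks as $\nu$ decreases.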

To improve the stability of the scheme, Chavent and Cockburn [12] modified
the scheme by introducing a suitably defined ‘slope limiter’ following the ideas
introduced by van Leer in [88]. They thus obtained a scheme that was proven to
be total variation diminishing in the means (TVDM) and total variation bounded
(TVB) under a fixed CFL number, $f'\,\Delta t/\Delta x$, that can be chosen to be less than or equal
to 1/2. Convergence of a subsequence is thus guaranteed, and the numerical results
given in [12] indicate convergence to the correct entropy solutions. However, the
scheme is only first-order accurate in time and the ‘slope limiter’ has to balance the
spurious oscillations in smooth regions caused by linear instability, hence adversely
affecting the quality of the approximation in these regions.

These difficulties were overcome by Cockburn and Shu in [26], where the first
Runge-Kutta Discontinuous Galerkin (RKDG) method was introduced. This method
was constructed (i) by retaining the piecewise linear DG method for the space
discretization, (ii) by using a special explicit TVD second-order Runge-Kutta type
discretization introduced by Shu and Osher in a finite difference framework [80],
[81], and (iii) by modifying the ‘slope limiter’ to maintain the formal accuracy of the
scheme at extrema. The resulting explicit scheme was then proven linearly stable for
CFL numbers less than 1/3, formally uniformly second-order accurate in space and
time including at extrema, and TVBM. Numerical results in [26] indicate good
convergence behavior: second order in smooth regions including extrema, sharp shock
transitions (usually in one or two elements) without oscillations, and convergence
to entropy solutions even for nonconvex fluxes.

In [24], Cockburn and Shu extended this approach to construct (formally)
high-order accurate RKDG methods for the scalar conservation law. To devise RKDG
methods of order $k + 1$, they used (i) the DG method with polynomials of
degree $k$ for the space discretization, (ii) a TVD $(k+1)$-th order accurate explicit
time discretization, and (iii) a generalized ‘slope limiter.’ The generalized ‘slope
limiter’ was carefully devised with the purpose of enforcing the TVDM property
without destroying the accuracy of the scheme. The numerical results in [24], for
$k = 1, 2$, indicate $(k+1)$-th order in smooth regions away from discontinuities
as well as sharp shock transitions with no oscillations; convergence to the entropy
solutions was observed in all the tests. These RKDG schemes were extended to
one-dimensional systems in [21].

•  The  multidimensional  case. 

The  extension  of  the  RKDG  method  to  the  multidimensional  case  was  done  in 
[20]  for  the  scalar  conservation  law.  In  the  multidimensional  case,  the  complicated 
geometry  the  spatial  domain  might  have  in  practical  applications  can  be  easily 
handled  by  the  DG  space  discretization.  The  TVD  time  discretizations  remain  the 
same,  of  course.  Only  the  construction  of  the  generalized  ‘slope  limiter’  represents  a 
serious  challenge.  This  is  so,  not  only  because  of  the  more  complicated  form  of  the 
elements  but  also  because  of  inherent  accuracy  barriers  imposed  by  the  stability 
properties. 

Indeed, since the main purpose of the ‘slope limiter’ is to enforce the
nonlinear stability of the scheme, it is essential to realize that in the multidimensional
case, the constraints imposed by the stability of a scheme on its accuracy are even
greater than in the one-dimensional case. Although in the one-dimensional case it is
possible to devise high-order accurate schemes with the TVD property, this is not
so in several space dimensions since Goodman and LeVeque [41] proved that any
TVD scheme is at most first-order accurate. Thus, any generalized ‘slope limiter’
that enforces the TVD property, or the TVDM property for that matter, would
unavoidably reduce the accuracy of the scheme to first-order accuracy. This is why
in [20], Cockburn, Hou and Shu devised a generalized ‘slope limiter’ that enforces
only a local maximum principle, since maximum principles are not incompatible with
high-order accuracy. No other class of schemes has a proven maximum principle for general
nonlinearities $f$ and arbitrary triangulations.

The extension of the RKDG methods to general multidimensional systems was
started by Cockburn and Shu in [25] and has been recently completed in [28]. Bey
and Oden [10], Bassi and Rebay [4], and more recently Baumann [6] and
Baumann and Oden [9] have studied applications of the method to the Euler
equations of gas dynamics. Recently, Kershaw et al. [56], from the Lawrence Livermore
National Laboratory, extended the method to arbitrary Lagrangian-Eulerian fluid
flows where the computational mesh can move to track the interface between the
different material species.

•  The main advantages of the RKDG method.

The resulting RKDG schemes have several important advantages. First, like
finite element methods such as the SUPG method of Hughes and Brooks [44], [49],
[45], [46], [47], [48] (which has been analyzed by Johnson et al. in [53], [54], [55]),
the RKDG methods are better suited than finite difference methods to handle
complicated geometries. Moreover, the particular finite elements of the DG space
discretization allow an extremely simple treatment of the boundary conditions; no
special numerical treatment of them is required in order to achieve uniform high-order
accuracy, as is the case for the finite difference schemes.

Second, the method can easily handle adaptivity strategies since the refining
or unrefining of the grid can be done without taking into account the continuity
restrictions typical of conforming finite element methods. Also, the degree of the
approximating polynomial can be easily changed from one element to the other.
Adaptivity is of particular importance in hyperbolic problems given the complexity
of the structure of the discontinuities. In the one-dimensional case the Riemann
problem can be solved in closed form and discontinuity curves in the $(x, t)$ plane are
simple straight lines passing through the origin. However, in two dimensions the
solutions of the Riemann problem display a very rich structure; see the works of Wagner [90], Lindquist
[62], [61], Tong and Zheng [86], and Tong and Chen [85]. Thus, methods which
allow triangulations that can be easily adapted to resolve this structure have an
important advantage.

Third,  the  method  is  highly  parallelizable.  Since  the  elements  are  discontinuous, 
the  mass  matrix  is  block  diagonal  and  since  the  order  of  the  blocks  is  equal  to  the 
number  of  degrees  of  freedom  inside  the  corresponding  elements,  the  blocks  can 
be  inverted  by  hand  once  and  for  all.  Thus,  at  each  Runge-Kutta  inner  step,  to 
update  the  degrees  of  freedom  inside  a  given  element,  only  the  degrees  of  freedom 
of  the  elements  sharing  a  face  are  involved;  communication  between  processors  is 
thus  kept  to  a  minimum.  Extensive  studies  of  adaptivity  and  parallelizability  issues 
of  the  RKDG  method  have  been  performed  by  Biswas,  Devine,  and  Flaherty  [11], 
Devine,  Flaherty,  Loy,  and  Wheat  [32],  Devine  and  Flaherty  [31],  and  more  recently 
by  Flaherty  et  al.  [40].  Studies  of  load  balancing  related  to  conservation  laws  but 
not  restricted  to  them  can  be  found  in  the  works  by  Devine,  Flaherty,  Wheat,  and 
Maccabe [33], by deCougny et al. [30], and by Ozturan et al. [74].

Convection-diffusion systems: The LDG method. The first extensions of
the RKDG method to nonlinear, convection-diffusion systems of the form

$$\partial_t u + \nabla \cdot F(u, Du) = 0, \quad \text{in } (0, T) \times \Omega,$$

were proposed by Chen et al. [15], [14] in the framework of hydrodynamic models
for semiconductor device simulation. In these extensions, approximations of second
and third-order derivatives of the discontinuous approximate solution were obtained
by using simple projections into suitable finite element spaces. This projection
requires the inversion of global mass matrices, which in [15] and [14] were ‘lumped’
in order to maintain the high parallelizability of the method. Since in [15] and
[14] polynomials of degree one are used, the ‘mass lumping’ is justified; however, if
polynomials of higher degree were used, the ‘mass lumping’ needed to enforce the
full parallelizability of the method could cause a degradation of the formal order of
accuracy.

Fortunately, this is not an issue with the methods proposed by Bassi and Rebay
[3] (see also Bassi et al. [4]) for the compressible Navier-Stokes equations. In these
methods, the original idea of the RKDG method is applied to both $u$ and $Du$, which
are now considered as independent unknowns. Like the RKDG methods, the
resulting methods are highly parallelizable methods of high-order accuracy which are
very efficient for time-dependent, convection-dominated flows. The LDG methods
considered by Cockburn and Shu [27] are a generalization of these methods.

The basic idea to construct the LDG methods is to suitably rewrite the original
system as a larger, degenerate, first-order system and then discretize it by the
RKDG method. By a careful choice of this rewriting, nonlinear stability can be
achieved even without slope limiters, just as for the RKDG method in the purely
hyperbolic case; see Jiang and Shu [51]. Moreover, error estimates (in the linear
case) have been obtained in [27]. An analysis of this method is currently
being carried out by Cockburn and Schwab [23] in the one-dimensional case by
taking into account the characterization of the viscous boundary layer of the exact
solution.

The LDG methods [27] are very different from the so-called Discontinuous
Galerkin (DG) method for parabolic problems introduced by Jamet [50] and
studied by Eriksson, Johnson, and Thomée [38], Eriksson and Johnson [34], [35], [36],
[37], and more recently by Makridakis and Babuška [68]. In the DG method, the
approximate solution is discontinuous only in time, not in space; in fact, the space
discretization is the standard Galerkin discretization with continuous finite
elements. This is in strong contrast with the space discretizations of the LDG methods,
which use discontinuous finite elements. To emphasize this difference, those
methods are called Local Discontinuous Galerkin methods. The large number of degrees
of freedom and the restrictive conditions on the size of the time step for explicit
time discretizations render the LDG methods inefficient for diffusion-dominated
problems; in this situation, the use of methods with continuous-in-space
approximate solutions is recommended. However, as for the successful RKDG methods
for purely hyperbolic problems, the extremely local domain of dependency of the
LDG methods allows a very efficient parallelization that by far compensates for
the extra number of degrees of freedom in the case of convection-dominated flows.
Karniadakis et al. have implemented and tested these methods for the compressible
Navier-Stokes equations in two and three space dimensions with impressive results;
see [64], [65], [63], [66], and [91].

Another technique to discretize the diffusion terms has been proposed by
Baumann [6]. The one-dimensional case was studied by Babuška, Baumann, and J.T.
Oden [2] and the multidimensional case has been considered by Oden, Babuška,
and Baumann [70]. The case of convection-diffusion in multidimensions was treated
by Baumann and Oden in [7]. In [8], Baumann and Oden consider applications to
the Navier-Stokes equations.

Finally, let us bring to the attention of the reader the non-conforming
staggered-grid Chebyshev spectral multidomain numerical method for the solution
of the compressible Navier-Stokes equations proposed and studied by Kopriva [57];
this method is strongly related to discontinuous Galerkin methods.

1.3  The  content  of  these  notes 

In these notes, we study the RKDG and LDG methods. Our exposition will be
based on the papers by Cockburn and Shu [26], [24], [21], [20], and [28] in which
the RKDG method was developed and on the paper by Cockburn and Shu [27]
which is devoted to the LDG methods. We also include numerical results from the
papers by Bassi and Rebay [4] and by Warburton, Lomtev, Kirby and Karniadakis
[91] on the Euler equations of gas dynamics and from the papers by Bassi and
Rebay [3] and by Lomtev and Karniadakis [63] on the compressible Navier-Stokes
equations. Finally, we also use the material contained in the paper by Hu and Shu
[43] in which the application of the RKDG method is extended to Hamilton-Jacobi
equations.

The emphasis in these notes is on how the above-mentioned schemes were
devised. As a consequence, the chapters that follow reflect that development. Thus,
Chapter 2, in which the RKDG schemes for the one-dimensional scalar
conservation law are constructed, constitutes the core of the notes because it contains all
the important ideas for the devising of the RKDG methods; Chapter 3 contains its
extension to one-dimensional Hamilton-Jacobi equations. In Chapter 4, we extend
the RKDG method to multidimensional systems and in Chapter 5, to
multidimensional Hamilton-Jacobi equations. Finally, in Chapter 6 we study the extension to
convection-diffusion problems.

We would like to emphasize that the guiding principle in the devising of the
RKDG methods for scalar conservation laws is to consider them as perturbations
of the so-called monotone schemes. As is well known, monotone schemes for
scalar conservation laws are stable and converge to the entropy solution but are
only first-order accurate. Following a widespread approach in the field of numerical
schemes for nonlinear conservation laws, the RKDG methods are constructed in such a way
that they are high-order accurate schemes that ‘become’ a monotone scheme when
a piecewise-constant approximation is used. Thus, to obtain high-order accurate
RKDG schemes, we ‘perturb’ the piecewise-constant approximation and allow it to
be a piecewise polynomial of arbitrary degree. Then, the conditions under which
the stability properties of the monotone schemes are still valid are sought and
enforced by means of the generalized ‘slope limiter.’ The fact that it is possible to do
so without destroying the accuracy of the RKDG method is the crucial point that
makes this method both robust and accurate.

The issues of parallelization and adaptivity developed by Biswas, Devine, and
Flaherty [11], Devine, Flaherty, Loy, and Wheat [32], Devine and Flaherty [31], and
by Flaherty et al. [40] (see also the works by Devine, Flaherty, Wheat, and
Maccabe [33], by deCougny et al. [30], and by Ozturan et al. [74]) are certainly very
important. Another issue of importance is how to render the method
computationally more efficient, as in the quadrature rule-free versions of the RKDG method
recently studied by Atkins and Shu [1]. However, these topics fall beyond the scope
of these notes, whose main intention is to provide a simple introduction to the topic
of discontinuous Galerkin methods for convection-dominated problems.




2  The  scalar  conservation  law  in  one  space  dimension 

2.1  Introduction 

In this section, we introduce and study the RKDG method for the following simple
model problem:

$$u_t + f(u)_x = 0, \quad \text{in } (0, 1) \times (0, T), \tag{2.1}$$

$$u(x, 0) = u_0(x), \quad \forall\, x \in (0, 1), \tag{2.2}$$

and periodic boundary conditions. This section has material drawn from [26] and
[24].

2.2  The  discontinuous  Galerkin-space  discretization 

The weak formulation. To discretize in space, we proceed as follows. For each
partition of the interval $(0,1)$, $\{x_{j+1/2}\}_{j=0}^{N}$, we set $I_j = (x_{j-1/2}, x_{j+1/2})$,
$\Delta_j = x_{j+1/2} - x_{j-1/2}$ for $j = 1, \dots, N$, and denote the quantity $\max_{1 \le j \le N} \Delta_j$ by $\Delta x$.

We seek an approximation $u_h$ to $u$ such that for each time $t \in [0, T]$, $u_h(t)$
belongs to the finite dimensional space

$$V_h = V_h^k = \{ v \in L^1(0,1) : v|_{I_j} \in P^k(I_j),\ j = 1, \dots, N \}, \tag{2.3}$$

where $P^k(I)$ denotes the space of polynomials in $I$ of degree at most $k$. In order to
determine the approximate solution $u_h$, we use a weak formulation that we obtain
as follows. First, we multiply the equations (2.1) and (2.2) by arbitrary, smooth
functions $v$ and integrate over $I_j$, and get, after a simple formal integration by
parts,

$$\int_{I_j} \partial_t u(x,t)\, v(x)\, dx - \int_{I_j} f(u(x,t))\, \partial_x v(x)\, dx
 + f(u(x_{j+1/2},t))\, v(x_{j+1/2}^-) - f(u(x_{j-1/2},t))\, v(x_{j-1/2}^+) = 0, \tag{2.4}$$

$$\int_{I_j} u(x,0)\, v(x)\, dx = \int_{I_j} u_0(x)\, v(x)\, dx. \tag{2.5}$$


Next, we replace the smooth functions $v$ by test functions $v_h$ belonging to the finite
element space $V_h$, and the exact solution $u$ by the approximate solution $u_h$. Since
the function $u_h$ is discontinuous at the points $x_{j+1/2}$, we must also replace the
nonlinear ‘flux’ $f(u(x_{j+1/2},t))$ by a numerical ‘flux’ that depends on the two values
of $u_h$ at the point $(x_{j+1/2}, t)$, that is, by the function

$$h(u)_{j+1/2}(t) = h(u(x_{j+1/2}^-, t),\, u(x_{j+1/2}^+, t)), \tag{2.6}$$

that will be suitably chosen later. Note that we always use the same numerical flux
regardless of the form of the finite element space. Thus, the approximate solution
given by the DG-space discretization is defined as the solution of the following weak
formulation:

$$\forall\, j = 1, \dots, N, \quad \forall\, v_h \in P^k(I_j):$$

$$\int_{I_j} \partial_t u_h(x,t)\, v_h(x)\, dx - \int_{I_j} f(u_h(x,t))\, \partial_x v_h(x)\, dx
 + h(u_h)_{j+1/2}(t)\, v_h(x_{j+1/2}^-) - h(u_h)_{j-1/2}(t)\, v_h(x_{j-1/2}^+) = 0, \tag{2.7}$$

$$\int_{I_j} u_h(x,0)\, v_h(x)\, dx = \int_{I_j} u_0(x)\, v_h(x)\, dx. \tag{2.8}$$


Incorporating the monotone numerical fluxes. To complete the definition
of the approximate solution $u_h$, it only remains to choose the numerical flux $h$. To
do that, we invoke our main point of view, namely, that we want to construct
schemes that are perturbations of the so-called monotone schemes. The idea is that
by perturbing the monotone schemes, we would achieve high-order accuracy while
keeping their stability and convergence properties. Thus, we want that in the case
$k = 0$, that is, when the approximate solution $u_h$ is a piecewise-constant function,
our DG-space discretization gives rise to a monotone scheme.

Since in this case, for $x \in I_j$ we can write

$$u_h(x,t) = u_j^0,$$

we can rewrite our weak formulation (2.7), (2.8) as follows:

$$\forall\, j = 1, \dots, N: \quad
\partial_t u_j^0(t) + \{ h(u_j^0(t), u_{j+1}^0(t)) - h(u_{j-1}^0(t), u_j^0(t)) \}/\Delta_j = 0,$$

and it is well-known that this defines a monotone scheme if $h(a, b)$ is a Lipschitz,
consistent, monotone flux, that is, if it is,

(i) locally Lipschitz and consistent with the flux $f(u)$, i.e., $h(u,u) = f(u)$,

(ii) a nondecreasing function of its first argument, and

(iii) a nonincreasing function of its second argument.
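
To make the piecewise-constant case concrete, here is a minimal Python sketch of one forward Euler step of the scheme above with periodic boundary conditions. The choices are ours, for illustration only: the Burgers flux $f(u) = u^2/2$, a uniform mesh of width `dx`, a Lax-Friedrichs type flux with a constant `C` bounding $|f'|$, and the forward Euler time discretization; all function names are hypothetical.

```python
import math

def burgers(u):
    return 0.5 * u * u

def lax_friedrichs(a, b, C):
    # Consistent (h(u,u) = f(u)), nondecreasing in a and nonincreasing in b
    # as long as C >= |f'| on the range of the data.
    return 0.5 * (burgers(a) + burgers(b) - C * (b - a))

def step_k0(u, dx, dt, C):
    """One forward Euler step of the k = 0 scheme on a periodic mesh."""
    n = len(u)
    # flux[j] approximates h at x_{j+1/2}; flux[-1] is the periodic x_{-1/2}.
    flux = [lax_friedrichs(u[j], u[(j + 1) % n], C) for j in range(n)]
    return [u[j] - dt / dx * (flux[j] - flux[j - 1]) for j in range(n)]

def total_variation(u):
    n = len(u)
    return sum(abs(u[(j + 1) % n] - u[j]) for j in range(n))

def sine_data(n):
    """Cell-center samples of 0.5 + sin(2*pi*x) on (0, 1)."""
    return [0.5 + math.sin(2 * math.pi * (j + 0.5) / n) for j in range(n)]
```

With `C * dt / dx <= 1` the update is a nondecreasing function of each of its three arguments, so in this sketch the total variation and the range of the data should not grow from one step to the next.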

The best-known examples of numerical fluxes satisfying the above properties are
the following:

(i) The Godunov flux:

$$h^{G}(a,b) = \begin{cases} \min_{a \le u \le b} f(u), & \text{if } a \le b, \\ \max_{b \le u \le a} f(u), & \text{otherwise.} \end{cases}$$

(ii) The Engquist-Osher flux:

$$h^{EO}(a,b) = \int_0^b \min(f'(s), 0)\, ds + \int_0^a \max(f'(s), 0)\, ds + f(0);$$

(iii) The Lax-Friedrichs flux:

$$h^{LF}(a,b) = \frac{1}{2}\, [ f(a) + f(b) - C\,(b - a) ], \qquad
C = \max_{\inf u^0(x) \le s \le \sup u^0(x)} |f'(s)|;$$

(iv) The local Lax-Friedrichs flux:

$$h^{LLF}(a,b) = \frac{1}{2}\, [ f(a) + f(b) - C\,(b - a) ], \qquad
C = \max_{\min(a,b) \le s \le \max(a,b)} |f'(s)|;$$

(v) The Roe flux with ‘entropy fix’:

$$h^{R}(a,b) = \begin{cases}
f(a), & \text{if } f'(u) \ge 0 \text{ for } u \in [\min(a,b), \max(a,b)], \\
f(b), & \text{if } f'(u) \le 0 \text{ for } u \in [\min(a,b), \max(a,b)], \\
h^{LLF}(a,b), & \text{otherwise.}
\end{cases}$$


For the flux $h$, we can use the Godunov flux $h^G$ since it is well-known that this
is  the  numerical  flux  that  produces  the  smallest  amount  of  artificial  viscosity.  The 
local  Lax-Friedrichs  flux  produces  more  artificial  viscosity  than  the  Godunov  flux, 
but  their  performances  are  remarkably  similar.  Of  course,  if  f  is  too  complicated,  we 
can  always  use  the  Lax-Friedrichs  flux.  However,  numerical  experience  suggests  that 
as  the  degree  k  of  the  approximate  solution  increases,  the  choice  of  the  numerical 
flux  does  not  have  a  significant  impact  on  the  quality  of  the  approximations. 
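The five fluxes listed above can be made concrete with a small sketch. It assumes the Burgers flux $f(u) = u^2/2$, which is not singled out in the text; this choice, and all function names, are illustrative only. For this flux $f'(u) = u$, so the integrals in the Engquist-Osher flux and the maxima in the Lax-Friedrichs constants have closed forms.

```python
def f(u):
    # Burgers flux, chosen here only as an example of a convex flux.
    return 0.5 * u * u

def godunov(a, b):
    # min of f over [a, b] if a <= b, max of f over [b, a] otherwise.
    if a <= b:
        return 0.0 if a <= 0.0 <= b else min(f(a), f(b))
    return max(f(a), f(b))  # f is convex, so the max sits at an endpoint

def engquist_osher(a, b):
    # int_0^b min(s, 0) ds + int_0^a max(s, 0) ds + f(0), with f(0) = 0.
    return 0.5 * max(a, 0.0) ** 2 + 0.5 * min(b, 0.0) ** 2

def lax_friedrichs(a, b, C):
    # C must bound |f'| over the range of the initial data.
    return 0.5 * (f(a) + f(b) - C * (b - a))

def local_lax_friedrichs(a, b):
    # max |f'(s)| = max |s| over [min(a, b), max(a, b)].
    return lax_friedrichs(a, b, max(abs(a), abs(b)))

def roe_entropy_fix(a, b):
    lo, hi = min(a, b), max(a, b)
    if lo >= 0.0:      # f' >= 0 on the whole interval
        return f(a)
    if hi <= 0.0:      # f' <= 0 on the whole interval
        return f(b)
    return local_lax_friedrichs(a, b)
```

For instance, `godunov(-1.0, 1.0)` picks up the sonic point and returns zero, whereas `local_lax_friedrichs(-1.0, 1.0)` returns a negative value, which reflects its larger artificial viscosity.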


Diagonalizing the mass matrix. If we choose the Legendre polynomials $P_\ell$ as local basis functions, we can exploit their $L^2$-orthogonality, namely,

$$\int_{-1}^{1} P_\ell(s)\,P_{\ell'}(s)\,ds = \Big(\frac{2}{2\ell+1}\Big)\,\delta_{\ell\ell'},$$

to obtain a diagonal mass matrix. Indeed, if, for $x \in I_j$, we express our approximate solution $u_h$ as follows:

$$u_h(x,t) = \sum_{\ell=0}^{k} u_j^\ell\,\varphi_\ell(x),$$

where

$$\varphi_\ell(x) = P_\ell(2\,(x - x_j)/\Delta_j),$$

the weak formulation (2.7), (2.8) takes the following simple form:

$\forall\, j = 1,\dots,N$ and $\ell = 0,\dots,k$:

$$\Big(\frac{1}{2\ell+1}\Big)\,\partial_t u_j^\ell(t) - \frac{1}{\Delta_j}\int_{I_j} f(u_h(x,t))\,\partial_x\varphi_\ell(x)\,dx + \frac{1}{\Delta_j}\big\{ h(u_h(x_{j+1/2}))(t) - (-1)^\ell\,h(u_h(x_{j-1/2}))(t) \big\} = 0,$$

$$u_j^\ell(0) = \frac{2\ell+1}{\Delta_j}\int_{I_j} u_0(x)\,\varphi_\ell(x)\,dx,$$

where we have used the following properties of the Legendre polynomials:

$$P_\ell(1) = 1, \qquad P_\ell(-1) = (-1)^\ell.$$
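The orthogonality relation and the endpoint values used above can be checked exactly with rational arithmetic. The sketch below builds the Legendre polynomials from Bonnet's recurrence and assembles the local mass matrix $\int_{I_j}\varphi_\ell\,\varphi_m\,dx$; the function names are hypothetical and the element width is passed in as `dx`.

```python
from fractions import Fraction

def legendre_coeffs(k):
    """Coefficient lists (lowest degree first) of P_0..P_k via Bonnet's
    recurrence (n + 1) P_{n+1} = (2n + 1) x P_n - n P_{n-1}."""
    P = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for n in range(1, k):
        x_pn = [Fraction(0)] + P[n]               # coefficients of x * P_n
        pn1 = P[n - 1] + [Fraction(0)] * 2        # P_{n-1} padded to same length
        P.append([((2 * n + 1) * a - n * b) / (n + 1) for a, b in zip(x_pn, pn1)])
    return P[:k + 1]

def integrate_product(p, q):
    """Exact integral over [-1, 1] of the product of two polynomials."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:                  # odd powers integrate to zero
                total += a * b * Fraction(2, i + j + 1)
    return total

def local_mass_matrix(k, dx):
    """Entries int_{I_j} phi_l phi_m dx = (dx / 2) int_{-1}^{1} P_l P_m ds."""
    P = legendre_coeffs(k)
    half = Fraction(dx) / 2
    return [[half * integrate_product(P[l], P[m]) for m in range(k + 1)]
            for l in range(k + 1)]
```

The resulting matrix is diagonal with entries $\Delta_j/(2\ell+1)$, which is where the factor $1/(2\ell+1)$ in front of $\partial_t u_j^\ell$ comes from after dividing by $\Delta_j$.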

This  shows  that  after  discretizing  in  space  the  problem  (2.1),  (2.2)  by  the  DG 
method,  we  obtain  a  system  of  ODEs  for  the  degrees  of  freedom  that  we  can  rewrite 
as  follows: 


$$\frac{d}{dt}\,u_h = L_h(u_h), \quad \text{in } (0,T), \tag{2.9}$$

$$u_h(t=0) = u_{0h}. \tag{2.10}$$

The element $L_h(u_h)$ of $V_h$ is, of course, the approximation to $-f(u)_x$ provided by the DG-space discretization.

Note that if we choose a different local basis, the local mass matrix could be a full matrix but it will always be a matrix of order $(k+1)$. By inverting it by means of a symbolic manipulator, we can always write the equations for the degrees of freedom of $u_h$ as an ODE system of the form above.


Convergence analysis of the linear case. In the linear case $f(u) = c\,u$, the $L^\infty(0,T;L^2(0,1))$-accuracy of the method (2.7), (2.8) can be established by using the $L^\infty(0,T;L^2(0,1))$-stability of the method and the approximation properties of the finite element space $V_h$.

Note that in this case, all the fluxes displayed in the examples above coincide and are equal to

$$h(a,b) = c\,\frac{a+b}{2} - \frac{|c|}{2}\,(b-a). \tag{2.11}$$

The following results are thus for this numerical flux.
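As a quick check of this formula, take $c > 0$ (the case $c < 0$ is symmetric). Since $f(u) = c\,u$ is then increasing, the Godunov flux reduces to pure upwinding, while the local Lax-Friedrichs constant is $C = |c|$, so that

$$h^{G}(a,b) = c\,a, \qquad h^{LLF}(a,b) = \tfrac12\,[\,c\,a + c\,b - |c|\,(b-a)\,] = c\,\frac{a+b}{2} - \frac{|c|}{2}\,(b-a) = c\,a.$$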

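We state the $L^2$-stability result in terms of the jumps of $u_h$ across $x_{j+1/2}$, which we denote by

$$[u_h]_{j+1/2} \equiv u_h(x_{j+1/2}^+) - u_h(x_{j+1/2}^-).$$


Proposition 1. ($L^2$-stability) We have,

$$\tfrac12\,\| u_h(T) \|^2_{L^2(0,1)} + \Theta_T(u_h) \le \tfrac12\,\| u_0 \|^2_{L^2(0,1)},$$

where

$$\Theta_T(u_h) = \frac{|c|}{2} \int_0^T \sum_{1 \le j \le N} [u_h(t)]^2_{j+1/2}\,dt.$$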


Note how the jumps of $u_h$ are controlled by the $L^2$-norm of the initial condition. This control reflects the subtle built-in dissipation mechanism of the DG-methods and is what allows the DG-methods to be more accurate than the standard Galerkin methods. Indeed, the standard Galerkin method has an order of accuracy equal to $k$ whereas the DG-methods have an order of accuracy equal to $k + 1/2$ for the same smoothness of the initial condition.

Theorem 2. (First $L^2$-error estimate) Suppose that the initial condition $u_0$ belongs to $H^{k+1}(0,1)$. Let $e$ be the approximation error $u - u_h$. Then we have,

$$\| e(T) \|_{L^2(0,1)} \le C\,| u_0 |_{H^{k+1}(0,1)}\,(\Delta x)^{k+1/2},$$

where $C$ depends solely on $k$, $|c|$, and $T$.


If the initial condition is more regular, the following sharper result can also be proved.


Theorem 3. (Second $L^2$-error estimate) Suppose that the initial condition $u_0$ belongs to $H^{k+2}(0,1)$. Let $e$ be the approximation error $u - u_h$. Then we have,

$$\| e(T) \|_{L^2(0,1)} \le C\,| u_0 |_{H^{k+2}(0,1)}\,(\Delta x)^{k+1},$$

where $C$ depends solely on $k$, $|c|$, and $T$.


Theorem 2 is a simplified version of a more general result proven in 1986 by Johnson and Pitkäranta [52] and Theorem 3 is a simplified version of a more general result proven in 1974 by LeSaint and Raviart [58]. To provide a simple introduction to the techniques used in these general results, we give new proofs of Theorems 2 and 3 in an appendix to this chapter.

The above theorems show that the DG-space discretization results in a $(k+1)$th-order accurate scheme, at least in the linear case. This gives a strong indication that the same order of accuracy should hold in the nonlinear case when the exact solution is smooth enough, of course.

Now that we know that the DG-space discretization produces a high-order accurate scheme for smooth exact solutions, we consider the question of how it behaves when the flux is a nonlinear function.


Convergence analysis in the nonlinear case. To study the convergence properties of the DG-method, we first study the convergence properties of the solution $w$ of the following problem:

$$w_t + f(w)_x = (\nu(w)\,w_x)_x, \tag{2.12}$$

$$w(\cdot,0) = u_0(\cdot), \tag{2.13}$$

and periodic boundary conditions. We then mimic the procedure to study the convergence of the DG-method for the piecewise-constant case. The general DG-method will be considered later after having introduced the Runge-Kutta time-discretization.

The continuous case as a model. In order to compare $u$ and $w$, it is enough to have (i) an entropy inequality and (ii) uniform boundedness of $\| w_x \|_{L^1(0,1)}$. Next, we show how to obtain these properties in a formal way.

We start with the entropy inequality. To obtain such an inequality, the basic idea is to multiply the equation (2.12) by $U'(w-c)$, where $U(\cdot)$ denotes the absolute value function and $c$ denotes an arbitrary real number. Since

$$U'(w-c)\,w_t = U(w-c)_t,$$

$$U'(w-c)\,f(w)_x = \big( U'(w-c)\,(f(w) - f(c)) \big)_x \equiv F(w,c)_x,$$

and since

$$U'(w-c)\,(\nu(w)\,w_x)_x = \Big( \int_c^w U'(\rho-c)\,\nu(\rho)\,d\rho \Big)_{xx} - U''(w-c)\,\nu(w)\,(w_x)^2 \equiv \Phi(w,c)_{xx} - U''(w-c)\,\nu(w)\,(w_x)^2,$$

we obtain

$$U(w-c)_t + F(w,c)_x - \Phi(w,c)_{xx} \le 0,$$

which is nothing but the entropy inequality we wanted.

To obtain the uniform boundedness of $\| w_x \|_{L^1(0,1)}$, the idea is to multiply the equation (2.12) by $-(U'(w_x))_x$ and integrate on $x$ from 0 to 1. Since

$$\int_0^1 -(U'(w_x))_x\,w_t = \int_0^1 U'(w_x)\,(w_x)_t = \frac{d}{dt}\,\| w_x \|_{L^1(0,1)},$$

$$\int_0^1 -(U'(w_x))_x\,f(w)_x = -\int_0^1 U''(w_x)\,w_{xx}\,f'(w)\,w_x = 0,$$

and since

$$\int_0^1 -(U'(w_x))_x\,(\nu(w)\,w_x)_x = -\int_0^1 U''(w_x)\,w_{xx}\,\big( \nu'(w)\,(w_x)^2 + \nu(w)\,w_{xx} \big) = -\int_0^1 U''(w_x)\,\nu(w)\,(w_{xx})^2 \le 0,$$

we immediately get that

$$\frac{d}{dt}\,\| w_x \|_{L^1(0,1)} \le 0,$$

and so,

$$\| w_x(t) \|_{L^1(0,1)} \le \| (u_0)_x \|_{L^1(0,1)}, \qquad \forall\, t \in (0,T).$$

When the function $u_0$ has discontinuities, the same result holds with the total variation of $u_0$, $| u_0 |_{TV(0,1)}$, replacing the quantity $\| (u_0)_x \|_{L^1(0,1)}$; these two quantities coincide when $u_0 \in W^{1,1}(0,1)$.

With  the  two  above  ingredients,  the  following  error  estimate,  obtained  in  1976 
by  Kuznetsov,  can  be  proved: 


Theorem 4. ($L^1$-error estimate) We have

$$\| u(T) - w(T) \|_{L^1(0,1)} \le | u_0 |_{TV(0,1)}\,\sqrt{8\,T\,\nu},$$

where $\nu = \sup_{s \in [\inf u_0,\,\sup u_0]} \nu(s)$.

The piecewise-constant case. Let us consider the simple case of the DG-method that uses a piecewise-constant approximate solution:

$$\forall\, j = 1,\dots,N: \qquad \partial_t u_j + \{h(u_j, u_{j+1}) - h(u_{j-1}, u_j)\}/\Delta_j = 0,$$

$$u_j(0) = \frac{1}{\Delta_j} \int_{I_j} u_0(x)\,dx,$$

where we have dropped the superindex '0.' We pick the numerical flux $h$ to be the Engquist-Osher flux.

According to the model provided by the continuous case, we must obtain (i) an entropy inequality and (ii) the uniform boundedness of the total variation of $u_h$. To obtain the entropy inequality, we multiply our equation by $U'(u_j - c)$:

$$\partial_t U(u_j - c) + U'(u_j - c)\,\{h(u_j, u_{j+1}) - h(u_{j-1}, u_j)\}/\Delta_j = 0.$$

The second term in the above equation needs to be carefully treated. First, we rewrite the Engquist-Osher flux in the following form:

$$h^{EO}(a,b) = f^+(a) + f^-(b),$$

and, accordingly, rewrite the second term of the equality above as follows:

$$ST_j = U'(u_j - c)\,\{f^+(u_j) - f^+(u_{j-1})\} + U'(u_j - c)\,\{f^-(u_{j+1}) - f^-(u_j)\}.$$

Using the simple identity

$$U'(a-c)\,(g(a) - g(b)) = G(a,c) - G(b,c) + \int_a^b (g(b) - g(\rho))\,U''(\rho - c)\,d\rho,$$

where $G(a,c) = \int_c^a U'(\rho - c)\,g'(\rho)\,d\rho$, we get

$$\begin{aligned} ST_j ={}& F^+(u_j,c) - F^+(u_{j-1},c) + \int_{u_j}^{u_{j-1}} (f^+(u_{j-1}) - f^+(\rho))\,U''(\rho - c)\,d\rho \\ &+ F^-(u_{j+1},c) - F^-(u_j,c) - \int_{u_j}^{u_{j+1}} (f^-(u_{j+1}) - f^-(\rho))\,U''(\rho - c)\,d\rho \\ ={}& F(u_j, u_{j+1}; c) - F(u_{j-1}, u_j; c) + \Theta_{diss,j}, \end{aligned}$$

where

$$F(a,b;c) = F^+(a,c) + F^-(b,c),$$

$$\Theta_{diss,j} = \int_{u_j}^{u_{j-1}} (f^+(u_{j-1}) - f^+(\rho))\,U''(\rho - c)\,d\rho - \int_{u_j}^{u_{j+1}} (f^-(u_{j+1}) - f^-(\rho))\,U''(\rho - c)\,d\rho.$$

We thus get

$$\partial_t U(u_j - c) + \{F(u_j, u_{j+1}; c) - F(u_{j-1}, u_j; c)\}/\Delta_j + \Theta_{diss,j}/\Delta_j = 0.$$

Since $f^+$ and $-f^-$ are nondecreasing functions, we easily see that

$$\Theta_{diss,j} \ge 0,$$

and we obtain our entropy inequality:

$$\partial_t U(u_j - c) + \{F(u_j, u_{j+1}; c) - F(u_{j-1}, u_j; c)\}/\Delta_j \le 0.$$

Next, we obtain the uniform boundedness on the total variation. To do that, we follow our model and multiply our equation by a discrete version of $-(U'(w_x))_x$, namely,

$$v_j^0 = -\frac{1}{\Delta_j} \Big\{ U'\Big( \frac{u_{j+1} - u_j}{\Delta_{j+1/2}} \Big) - U'\Big( \frac{u_j - u_{j-1}}{\Delta_{j-1/2}} \Big) \Big\},$$

where $\Delta_{j+1/2} = (\Delta_j + \Delta_{j+1})/2$, multiply it by $\Delta_j$ and sum over $j$ from 1 to $N$. We easily obtain

$$\frac{d}{dt}\,| u_h |_{TV(0,1)} + \sum_{1 \le j \le N} v_j^0\,\{h(u_j, u_{j+1}) - h(u_{j-1}, u_j)\} = 0,$$

where

$$| u_h |_{TV(0,1)} \equiv \sum_{1 \le j \le N} | u_{j+1} - u_j |.$$

According to our continuous model, the second term in the above equality should be nonnegative. This is indeed the case since the expression

$$v_j^0\,\{h(u_j, u_{j+1}) - h(u_{j-1}, u_j)\}$$

is equal to the quantity

$$v_j^0\,\{f^+(u_j) - f^+(u_{j-1})\} + v_j^0\,\{f^-(u_{j+1}) - f^-(u_j)\},$$

which is nonnegative by the definition of $v_j^0$, $f^+$, and $f^-$. This implies that

$$| u_h(t) |_{TV(0,1)} \le | u_h(0) |_{TV(0,1)} \le | u_0 |_{TV(0,1)}. \tag{2.14}$$

With  the  two  above  ingredients,  the  following  error  estimate,  obtained  in  1976 
by  Kuznetsov,  can  be  proved: 


Theorem 5. ($L^1$-error estimate) We have

$$\| u(T) - u_h(T) \|_{L^1(0,1)} \le \| u_0 - u_h(0) \|_{L^1(0,1)} + C\,| u_0 |_{TV(0,1)}\,\sqrt{T\,\Delta x}.$$

The general case. Error estimates for the case of arbitrary $k$ have not been obtained yet. However, Jiang and Shu [51] found a very interesting result in the case in which the nonlinear flux $f$ is strictly convex or concave. In such a situation, the existence of a discrete, local entropy inequality for the scheme for only a single entropy is enough to guarantee that the limit of the scheme, if it exists, is the entropy solution. Jiang and Shu [51] found such a discrete, local entropy inequality for the DG-method.

To describe the main idea of their result, let us first consider the model equation

$$u_t + f(u)_x = (\nu\,u_x)_x.$$

If we multiply the equation by $u$ we obtain, after very simple manipulations,

$$\tfrac12\,(u^2)_t + \big( F(u) - \tfrac{\nu}{2}\,(u^2)_x \big)_x + \Theta = 0,$$

where

$$F(u) = u\,f(u) - \int^u f(s)\,ds,$$

and

$$\Theta = \nu\,(u_x)^2.$$

Since $\Theta \ge 0$, we immediately obtain the following entropy inequality:

$$\tfrac12\,(u^2)_t + \big( F(u) - \tfrac{\nu}{2}\,(u^2)_x \big)_x \le 0.$$

Now, we only need to mimic the above procedure using the numerical scheme (2.7) instead of the above parabolic equation and obtain a discrete version of the above entropy inequality. To do that, we simply take $v_h = u_h$ in (2.7) and rearrange terms in a suitable way. If we use the following notation:

$$\bar u_{j+1/2} = (u^-_{j+1/2} + u^+_{j+1/2})/2,$$

$$[u]_{j+1/2} = (u^+_{j+1/2} - u^-_{j+1/2}),$$

the result can be expressed as follows.


Proposition 6. We have, for $j = 1,\dots,N$,

$$\frac12\,\frac{d}{dt} \int_{I_j} u_h^2(x,t)\,dx + \hat F_{j+1/2} - \hat F_{j-1/2} + \Theta_j = 0,$$

where

$$\hat F_{j+1/2} = \bar u_{j+1/2}\,h(u_h)_{j+1/2} - \int^{\bar u_{j+1/2}} f(s)\,ds,$$

and

$$\Theta_j = -\int_{\bar u_{j+1/2}}^{u^-_{j+1/2}} (f(s) - h(u_h)_{j+1/2})\,ds + \int_{\bar u_{j-1/2}}^{u^+_{j-1/2}} (f(s) - h(u_h)_{j-1/2})\,ds.$$

Since the quantity $\Theta_j$ is nonnegative (because the numerical flux is nondecreasing in its first argument and nonincreasing in its second argument), we immediately obtain the following discrete, local entropy inequality:

$$\frac12\,\frac{d}{dt} \int_{I_j} u_h^2(x,t)\,dx + \hat F_{j+1/2} - \hat F_{j-1/2} \le 0.$$

As  a  consequence,  we  have  the  following  result. 


Theorem 7. Let $f$ be a strictly convex or concave function. Then, for any $k \ge 0$,
if  the  numerical  solution  given  by  the  DG  method  converges,  it  converges  to  the 
entropy  solution. 


There  is  no  other  formally  high-order  accurate  numerical  scheme  that  has  the 
above  property.  See  Jiang  and  Shu  [51]  for  further  developments  of  the  above  result. 


2.3  The  TVD-Runge-Kutta  time  discretization 

To discretize our ODE system in time, we use the TVD Runge-Kutta time discretization introduced in [83]; see also [80] and [81].


The discretization. Thus, if $\{t^n\}_{n=0}^{N}$ is a partition of $[0,T]$ and $\Delta t^n = t^{n+1} - t^n$, $n = 0,\dots,N-1$, our time-marching algorithm reads as follows:

- Set $u_h^0 = u_{0h}$;

- For $n = 0,\dots,N-1$ compute $u_h^{n+1}$ from $u_h^n$ as follows:

1. set $u_h^{(0)} = u_h^n$;

2. for $i = 1,\dots,k+1$ compute the intermediate functions:

$$u_h^{(i)} = \sum_{l=0}^{i-1} \big\{ \alpha_{il}\,u_h^{(l)} + \beta_{il}\,\Delta t^n\,L_h(u_h^{(l)}) \big\};$$

3. set $u_h^{n+1} = u_h^{(k+1)}$.


Note that this method is very easy to code since only a single subroutine defining $L_h(u_h)$ is needed. Some Runge-Kutta time discretization parameters are displayed in the table below.

Table 1


The stability property. Note that all the values of the parameters $\alpha_{il}$ displayed in the table above are nonnegative; this is not an accident. Indeed, this is a condition on the parameters $\alpha_{il}$ that ensures the stability property

$$| u_h^{n+1} | \le | u_h^n |,$$

provided that the 'local' stability property

$$| w | \le | v |, \tag{2.15}$$

where $w$ is obtained from $v$ by the following 'Euler forward' step,

$$w = v + \delta\,L_h(v), \tag{2.16}$$

holds for values of $|\delta|$ smaller than a given number $\delta_0$.

For example, the second-order Runge-Kutta method displayed in the table above can be rewritten as follows:

$$u_h^{(1)} = u_h^n + \Delta t\,L_h(u_h^n),$$

$$w_h = u_h^{(1)} + \Delta t\,L_h(u_h^{(1)}),$$

$$u_h^{n+1} = \tfrac12\,(u_h^n + w_h).$$

Now, assuming that the stability property (2.15), (2.16) is satisfied for

$$\delta_0 = | \Delta t\,\max\{\beta_{il}/\alpha_{il}\} | = \Delta t,$$

we have

$$| u_h^{(1)} | \le | u_h^n |, \qquad | w_h | \le | u_h^{(1)} |,$$

and so,

$$| u_h^{n+1} | \le \tfrac12\,(| u_h^n | + | w_h |) \le | u_h^n |.$$

Note that we can obtain this result because the coefficients $\alpha_{il}$ are positive! Runge-Kutta methods of this type of order up to 5 can be found in [81].
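The time-marching loop described above needs only the operator $L_h$ and the two coefficient arrays. The following minimal sketch assumes the standard second- and third-order Shu-Osher coefficients; these are taken to be the values of Table 1, whose entries are not reproduced in this text, so that identification is an assumption. The names `tvd_rk_step`, `SSP_RK2`, and `SSP_RK3` are hypothetical.

```python
# Row i-1 holds (alpha_{i0}, ..., alpha_{i,i-1}) and (beta_{i0}, ..., beta_{i,i-1}).
# Assumed values: the standard Shu-Osher coefficients of orders 2 and 3.
SSP_RK2 = ([[1.0], [0.5, 0.5]],
           [[1.0], [0.0, 0.5]])
SSP_RK3 = ([[1.0], [0.75, 0.25], [1.0 / 3.0, 0.0, 2.0 / 3.0]],
           [[1.0], [0.0, 0.25], [0.0, 0.0, 2.0 / 3.0]])

def tvd_rk_step(u, dt, L, scheme):
    """Advance u by one step: u^(i) = sum_l alpha_il u^(l) + beta_il dt L(u^(l)).

    u may be a float or any object supporting + and scalar * (for example
    a NumPy array); L is the spatial operator L_h.
    """
    alpha, beta = scheme
    stages = [u]                                  # stages[l] holds u^(l)
    for a_row, b_row in zip(alpha, beta):
        new = 0.0
        for l, (a, b) in enumerate(zip(a_row, b_row)):
            new = new + a * stages[l]
            if b != 0.0:                          # skip unused evaluations of L
                new = new + b * dt * L(stages[l])
        stages.append(new)
    return stages[-1]
```

Because every stage is a convex combination of Euler forward steps with step sizes $(\beta_{il}/\alpha_{il})\,\Delta t$, any seminorm bound that holds for a single Euler step carries over to the whole step; this is the mechanism used in the second-order example above.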

The above example shows how to prove the following more general result.

Theorem 8. (Stability of the Runge-Kutta discretization) Assume that the stability property for the single 'Euler forward' step (2.15), (2.16) is satisfied for

$$\delta_0 = \max_{0 \le n \le N} | \Delta t^n\,\max\{\beta_{il}/\alpha_{il}\} |.$$

Assume also that all the coefficients $\alpha_{il}$ are nonnegative and satisfy the following condition:

$$\sum_{l=0}^{i-1} \alpha_{il} = 1, \qquad i = 1,\dots,k+1.$$

Then

$$| u_h^n | \le | u_h^0 |, \qquad \forall\, n \ge 0.$$


This  stability  property  of  the  TVD-Runge-Kutta  methods  is  crucial  since  it 
allows  us  to  obtain  the  stability  of  the  method  from  the  stability  of  a  single  ‘Euler 
forward’  step. 

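Proof of Theorem 8. We start by rewriting our time discretization as follows:

- Set $u_h^0 = u_{0h}$;

- For $n = 0,\dots,N-1$ compute $u_h^{n+1}$ from $u_h^n$ as follows:

1. set $u_h^{(0)} = u_h^n$;

2. for $i = 1,\dots,k+1$ compute the intermediate functions:

$$u_h^{(i)} = \sum_{l=0}^{i-1} \alpha_{il}\,w_h^{(il)},$$

where

$$w_h^{(il)} = u_h^{(l)} + \frac{\beta_{il}}{\alpha_{il}}\,\Delta t^n\,L_h(u_h^{(l)});$$

3. set $u_h^{n+1} = u_h^{(k+1)}$.

We then have

$$| u_h^{(i)} | \le \sum_{l=0}^{i-1} \alpha_{il}\,| w_h^{(il)} |, \qquad \text{since } \alpha_{il} \ge 0,$$

$$\le \sum_{l=0}^{i-1} \alpha_{il}\,| u_h^{(l)} |,$$

by the stability property (2.15), (2.16), and finally,

$$| u_h^{(i)} | \le \max_{0 \le l \le i-1} | u_h^{(l)} |,$$

since

$$\sum_{l=0}^{i-1} \alpha_{il} = 1.$$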

It is clear now that Theorem 8 follows from the above inequality by a simple induction argument. This concludes the proof.

Remarks about the stability in the linear case. For the linear case $f(u) = c\,u$, Chavent and Cockburn [12] proved that for the case $k = 1$, i.e., for piecewise-linear approximate solutions, the single 'Euler forward' step is unconditionally $L^\infty(0,T;L^2(0,1))$-unstable for any fixed ratio $\Delta t/\Delta x$. On the other hand, in [26] it was shown that if a Runge-Kutta method of second order is used, the scheme is $L^\infty(0,T;L^2(0,1))$-stable provided that

$$c\,\frac{\Delta t}{\Delta x} \le \frac13.$$

This means that we cannot deduce the stability of the complete Runge-Kutta method from the stability of the single 'Euler forward' step. As a consequence, we cannot apply Theorem 8 and we must consider the complete method at once.

When polynomials of degree $k$ are used, a Runge-Kutta method of order $(k+1)$ must be used. If this is the case, for $k = 2$, the $L^\infty(0,T;L^2(0,1))$-stability condition can be proven to be the following:

$$c\,\frac{\Delta t}{\Delta x} \le \frac15.$$

The stability condition for a general value of $k$ is still not known.

At first glance, this stability condition, also called the Courant-Friedrichs-Lewy (CFL) condition, seems to compare unfavorably with that of the well-known finite difference schemes. However, we must remember that in the DG-methods there are $(k+1)$ degrees of freedom in each element of size $\Delta x$ whereas for finite difference schemes there is a single degree of freedom in each cell of size $\Delta x$. Also, if a finite difference scheme is of order $(k+1)$ its so-called stencil must be of at least $(2k+1)$ points, whereas the DG-scheme has a stencil of $(k+1)$ elements only.


Convergence analysis in the nonlinear case. Now, we explore the impact of the explicit Runge-Kutta time-discretization on the convergence properties of the methods under consideration. We start by considering the piecewise-constant case.

The piecewise-constant case. Let us begin by considering the simplest case, namely,

$$\forall\, j = 1,\dots,N: \qquad (u_j^{n+1} - u_j^n)/\Delta t + \{h(u_j^n, u_{j+1}^n) - h(u_{j-1}^n, u_j^n)\}/\Delta_j = 0,$$

$$u_j(0) = \frac{1}{\Delta_j} \int_{I_j} u_0(x)\,dx,$$

where we pick the numerical flux $h$ to be the Engquist-Osher flux.

According to the model provided by the continuous case, we must obtain (i) an entropy inequality and (ii) the uniform boundedness of the total variation of $u_h$. To obtain the entropy inequality, we proceed as in the semidiscrete case and obtain the following result; see [18] for details.

Theorem 9. (Discrete entropy inequality) We have

$$\{U(u_j^{n+1} - c) - U(u_j^n - c)\}/\Delta t + \{F(u_j^n, u_{j+1}^n; c) - F(u_{j-1}^n, u_j^n; c)\}/\Delta_j + \Theta^n_{diss,j}/\Delta t = 0,$$

where

$$\begin{aligned} \Theta^n_{diss,j} ={}& \int_{u_j^{n+1}}^{u_j^n} (p_j(u_j^n) - p_j(\rho))\,U''(\rho - c)\,d\rho \\ &+ \frac{\Delta t}{\Delta_j} \int_{u_j^{n+1}}^{u_{j-1}^n} (f^+(u_{j-1}^n) - f^+(\rho))\,U''(\rho - c)\,d\rho \\ &- \frac{\Delta t}{\Delta_j} \int_{u_j^{n+1}}^{u_{j+1}^n} (f^-(u_{j+1}^n) - f^-(\rho))\,U''(\rho - c)\,d\rho, \end{aligned}$$

and

$$p_j(w) = w - \frac{\Delta t}{\Delta_j}\,(f^+(w) - f^-(w)).$$

Moreover, if the following CFL condition is satisfied

$$\max_{1 \le j \le N} \frac{\Delta t}{\Delta_j}\,| f' | \le 1,$$

then $\Theta^n_{diss,j} \ge 0$, and the following entropy inequality holds:

$$\{U(u_j^{n+1} - c) - U(u_j^n - c)\}/\Delta t + \{F(u_j^n, u_{j+1}^n; c) - F(u_{j-1}^n, u_j^n; c)\}/\Delta_j \le 0.$$

Note that $\Theta^n_{diss,j} \ge 0$ because $f^+$, $-f^-$ are nondecreasing and because $p_j$ is also nondecreasing under the above CFL condition.

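Next, we obtain the uniform boundedness on the total variation. Proceeding as before, we easily obtain the following result.

Theorem 10. (TVD property) We have

$$| u_h^{n+1} |_{TV(0,1)} - | u_h^n |_{TV(0,1)} + \Theta^n_{TV} = 0,$$

where

$$\begin{aligned} \Theta^n_{TV} ={}& \sum_{1 \le j \le N} \big( U'^{\,n}_{j+1/2} - U'^{\,n+1}_{j+1/2} \big)\,\big( p_{j+1/2}(u_{j+1}^n) - p_{j+1/2}(u_j^n) \big) \\ &+ \sum_{1 \le j \le N} \frac{\Delta t}{\Delta_j}\,\big( U'^{\,n}_{j-1/2} - U'^{\,n+1}_{j+1/2} \big)\,\big( f^+(u_j^n) - f^+(u_{j-1}^n) \big) \\ &- \sum_{1 \le j \le N} \frac{\Delta t}{\Delta_j}\,\big( U'^{\,n}_{j+1/2} - U'^{\,n+1}_{j-1/2} \big)\,\big( f^-(u_{j+1}^n) - f^-(u_j^n) \big), \end{aligned}$$

where

$$U'^{\,m}_{i+1/2} = U'\Big( \frac{u_{i+1}^m - u_i^m}{\Delta_{i+1/2}} \Big),$$

and

$$p_{j+1/2}(w) = w - \frac{\Delta t}{\Delta_{j+1}}\,f^+(w) + \frac{\Delta t}{\Delta_j}\,f^-(w).$$

Moreover, if the following CFL condition is satisfied

$$\max_{1 \le j \le N} \frac{\Delta t}{\Delta_j}\,| f' | \le 1,$$

then $\Theta^n_{TV} \ge 0$, and we have

$$| u_h^n |_{TV(0,1)} \le | u_0 |_{TV(0,1)}.$$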
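The TVD property can be observed numerically. The sketch below runs the piecewise-constant scheme with Euler forward time stepping and the Engquist-Osher flux; it assumes the Burgers flux $f(u) = u^2/2$, a uniform periodic mesh on $(0,1)$, and a sine-wave initial condition sampled at cell centers in place of exact cell averages. All of these choices, and all names, are illustrative assumptions.

```python
import math

def eo_flux(a, b):
    # Engquist-Osher flux f^+(a) + f^-(b) for the Burgers flux f(u) = u^2 / 2.
    return 0.5 * max(a, 0.0) ** 2 + 0.5 * min(b, 0.0) ** 2

def total_variation(u):
    # Periodic total variation of the cell values; u[-1] wraps around.
    return sum(abs(u[j] - u[j - 1]) for j in range(len(u)))

def euler_step(u, dt, dx):
    n = len(u)
    h = [eo_flux(u[j], u[(j + 1) % n]) for j in range(n)]    # h[j] = h(u_j, u_{j+1})
    return [u[j] - dt / dx * (h[j] - h[j - 1]) for j in range(n)]

def run(n_cells=100, t_final=0.6, cfl=0.9):
    dx = 1.0 / n_cells
    u = [0.5 + math.sin(2.0 * math.pi * (j + 0.5) * dx) for j in range(n_cells)]
    tv, t = [total_variation(u)], 0.0
    while t < t_final - 1e-14:
        # CFL condition (dt / dx) |f'| <= 1, with |f'(u)| = |u| for Burgers.
        dt = min(cfl * dx / max(abs(v) for v in u), t_final - t)
        u = euler_step(u, dt, dx)
        t += dt
        tv.append(total_variation(u))
    return u, tv
```

Calling `run()` returns the final cell values together with the history of the total variation, which should be nonincreasing up to rounding even after the shock has formed.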


With  the  two  above  ingredients,  the  following  error  estimate,  obtained  in  1976 
by  Kuznetsov,  can  be  proved: 


Theorem 11. ($L^1$-error estimate for monotone schemes) We have

$$\| u(T) - u_h(T) \|_{L^1(0,1)} \le \| u_0 - u_h(0) \|_{L^1(0,1)} + C\,| u_0 |_{TV(0,1)}\,\sqrt{T\,\Delta x}.$$

The general case. The study of the general case is much more difficult than the study of the monotone schemes. In these notes, we restrict ourselves to the study of the stability of the RKDG schemes. Hence, we restrict ourselves to the task of studying under what conditions the total variation of the local means is uniformly bounded.

If we denote by $\bar u_j$ the mean of $u_h$ on the interval $I_j$, by setting $v_h = 1$ in the equation (2.7), we obtain,

$$\forall\, j = 1,\dots,N: \qquad (\bar u_j)_t + \{h(u^-_{j+1/2}, u^+_{j+1/2}) - h(u^-_{j-1/2}, u^+_{j-1/2})\}/\Delta_j = 0,$$

where $u^-_{j+1/2}$ denotes the limit from the left and $u^+_{j+1/2}$ the limit from the right. We pick the numerical flux $h$ to be the Engquist-Osher flux.

This shows that if we set $w_h$ equal to the Euler forward step $u_h + \delta\,L_h(u_h)$, we obtain

$$\forall\, j = 1,\dots,N: \qquad (\bar w_j - \bar u_j)/\delta + \{h(u^-_{j+1/2}, u^+_{j+1/2}) - h(u^-_{j-1/2}, u^+_{j-1/2})\}/\Delta_j = 0.$$

Proceeding exactly as in the piecewise-constant case, we obtain the following result for the total variation of the averages,

$$| \bar u_h |_{TV(0,1)} \equiv \sum_{1 \le j \le N} | \bar u_{j+1} - \bar u_j |.$$


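Theorem 12. (The TVDM property) We have

$$| \bar w_h |_{TV(0,1)} - | \bar u_h |_{TV(0,1)} + \Theta_{TVM} = 0,$$

where

$$\begin{aligned} \Theta_{TVM} ={}& \sum_{1 \le j \le N} \big( U'(u)_{j+1/2} - U'(w)_{j+1/2} \big)\,\big( p_{j+1/2}(u_h|_{I_{j+1}}) - p_{j+1/2}(u_h|_{I_j}) \big) \\ &+ \sum_{1 \le j \le N} \frac{\delta}{\Delta_j}\,\big( U'(u)_{j-1/2} - U'(w)_{j+1/2} \big)\,\big( f^+(u^-_{j+1/2}) - f^+(u^-_{j-1/2}) \big) \\ &- \sum_{1 \le j \le N} \frac{\delta}{\Delta_j}\,\big( U'(u)_{j+1/2} - U'(w)_{j-1/2} \big)\,\big( f^-(u^+_{j+1/2}) - f^-(u^+_{j-1/2}) \big), \end{aligned}$$

where

$$U'(v)_{i+1/2} = U'\Big( \frac{\bar v_{i+1} - \bar v_i}{\Delta_{i+1/2}} \Big),$$

and

$$p_{j+1/2}(u_h|_{I_m}) = \bar u_m - \frac{\delta}{\Delta_{j+1}}\,f^+(u^-_{m+1/2}) + \frac{\delta}{\Delta_j}\,f^-(u^+_{m-1/2}).$$

From the above result, we see that the total variation of the means of the Euler forward step is nonincreasing if the following sign conditions are satisfied:

$$\mathrm{sgn}(\bar u_{j+1} - \bar u_j) = \mathrm{sgn}\big( p_{j+1/2}(u_h|_{I_{j+1}}) - p_{j+1/2}(u_h|_{I_j}) \big), \tag{2.17}$$

$$\mathrm{sgn}(\bar u_j - \bar u_{j-1}) = \mathrm{sgn}(u^-_{j+1/2} - u^-_{j-1/2}), \tag{2.18}$$

$$\mathrm{sgn}(\bar u_{j+1} - \bar u_j) = \mathrm{sgn}(u^+_{j+1/2} - u^+_{j-1/2}). \tag{2.19}$$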

Note that if the sign conditions (2.18) and (2.19) are satisfied, then the sign condition (2.17), the only one that involves $\delta$, can always be satisfied for small enough values of $|\delta|$.

Of course, the numerical method under consideration does not provide an approximate solution automatically satisfying the above conditions. It is thus necessary to enforce them by means of a suitably defined generalized slope limiter, $\Lambda\Pi_h$.


2.4  The  generalized  slope  limiter 

High-order accuracy versus the TVDM property: Heuristics. The ideal generalized slope limiter $\Lambda\Pi_h$

- Maintains the conservation of mass element by element,

- Satisfies the sign properties (2.17), (2.18), and (2.19),

- Does not degrade the accuracy of the method.

The first requirement simply states that the slope limiting must not change the total mass contained in each interval, that is, if $u_h = \Lambda\Pi_h(v_h)$,

$$\bar u_j = \bar v_j, \qquad j = 1,\dots,N.$$

This is, of course, a very sensible requirement because after all we are dealing with conservation laws. It is also a requirement very easy to satisfy.

The second requirement states that if $u_h = \Lambda\Pi_h(v_h)$ and $w_h = u_h + \delta\,L_h(u_h)$ then

$$| \bar w_h |_{TV(0,1)} \le | \bar u_h |_{TV(0,1)},$$

for small enough values of $|\delta|$.

The third requirement deserves a more delicate discussion. Note that if $u_h$ is a very good approximation of a smooth solution $u$ in a neighborhood of the point $x_0$, it behaves (asymptotically as $\Delta x$ goes to zero) as a straight line if $u_x(x_0) \ne 0$. If $x_0$ is an isolated extremum of $u$, then it behaves like a parabola provided $u_{xx}(x_0) \ne 0$. Now, if $u_h$ is a straight line, it trivially satisfies conditions (2.18) and (2.19). However, if $u_h$ is a parabola, conditions (2.18) and (2.19) are not always satisfied. This shows that it is impossible to construct the above ideal generalized 'slope limiter,' or, in other words, that in order to enforce the TVDM property, we must lose high-order accuracy at the local extrema. This is a very well-known phenomenon for TVD finite difference schemes!

Fortunately,  it  is  still  possible  to  construct  generalized  slope  limiters  that  do 
preserve  high-order  accuracy  even  at  local  extrema.  The  resulting  scheme  will  then 
not  be  TVDM  but  total  variation  bounded  in  the  means  (TVBM)  as  we  will  show. 

In  what  follows  we  first  consider  generalized  slope  limiters  that  render  the 
RKDG  schemes  TVDM.  Then  we  suitably  modify  them  in  order  to  obtain  TVBM 
schemes. 


Constructing TVDM generalized slope limiters. Next, we look for simple, sufficient conditions on the function $u_h$ that imply the sign properties (2.17), (2.18), and (2.19). These conditions will be stated in terms of the minmod function $m$ defined as follows:

$$m(a_1,\dots,a_\nu) = \begin{cases} s\,\min_{1 \le n \le \nu} | a_n |, & \text{if } s = \mathrm{sign}(a_1) = \dots = \mathrm{sign}(a_\nu), \\ 0, & \text{otherwise.} \end{cases}$$


Proposition 13. (Sufficient conditions for the sign properties) Suppose that the following CFL condition is satisfied:

For all $j = 1,\dots,N$:

$$| \delta |\,\Big( \frac{| f^+ |_{Lip}}{\Delta_{j+1}} + \frac{| f^- |_{Lip}}{\Delta_j} \Big) \le 1/2. \tag{2.20}$$

Then, conditions (2.17), (2.18), and (2.19) are satisfied if, for all $j = 1,\dots,N$, we have that

$$u^-_{j+1/2} = \bar u_j + m(u^-_{j+1/2} - \bar u_j,\ \bar u_j - \bar u_{j-1},\ \bar u_{j+1} - \bar u_j), \tag{2.21}$$

$$u^+_{j-1/2} = \bar u_j - m(\bar u_j - u^+_{j-1/2},\ \bar u_j - \bar u_{j-1},\ \bar u_{j+1} - \bar u_j).$$

Proof. Let us start by showing that the property (2.18) is satisfied. We have:

$$u^-_{j+1/2} - u^-_{j-1/2} = (u^-_{j+1/2} - \bar u_j) + (\bar u_j - \bar u_{j-1}) + (\bar u_{j-1} - u^-_{j-1/2}) = \Theta\,(\bar u_j - \bar u_{j-1}),$$

where

$$\Theta = 1 + \frac{u^-_{j+1/2} - \bar u_j}{\bar u_j - \bar u_{j-1}} - \frac{u^-_{j-1/2} - \bar u_{j-1}}{\bar u_j - \bar u_{j-1}} \in [0,2],$$

by conditions (2.21) and (2.22). This implies that the property (2.18) is satisfied. Properties (2.19) and (2.17) are proven in a similar way. This completes the proof.


Examples of TVDM generalized slope limiters

a. The MUSCL limiter. In the case of piecewise linear approximate solutions, that is,

$$v_h|_{I_j} = \bar v_j + (x - x_j)\,v_{x,j}, \qquad j = 1,\dots,N,$$

the following generalized slope limiter does satisfy the conditions (2.21) and (2.22):

$$u_h|_{I_j} = \bar v_j + (x - x_j)\,m\Big( v_{x,j},\ \frac{\bar v_{j+1} - \bar v_j}{\Delta_j},\ \frac{\bar v_j - \bar v_{j-1}}{\Delta_j} \Big).$$

This is the well-known slope limiter of the MUSCL schemes of van Leer [88,89].

b. The less restrictive limiter $\Lambda\Pi_h^1$. The following less restrictive slope limiter also satisfies the conditions (2.21) and (2.22):

$$u_h|_{I_j} = \bar v_j + (x - x_j)\,m\Big( v_{x,j},\ \frac{\bar v_{j+1} - \bar v_j}{\Delta_j/2},\ \frac{\bar v_j - \bar v_{j-1}}{\Delta_j/2} \Big).$$

Moreover, it can be rewritten as follows:

$$u^-_{j+1/2} = \bar v_j + m(v^-_{j+1/2} - \bar v_j,\ \bar v_j - \bar v_{j-1},\ \bar v_{j+1} - \bar v_j),$$

$$u^+_{j-1/2} = \bar v_j - m(\bar v_j - v^+_{j-1/2},\ \bar v_j - \bar v_{j-1},\ \bar v_{j+1} - \bar v_j).$$

We denote this limiter by $\Lambda\Pi_h^1$.

Note that we have that

$$\| \bar v_h - \Lambda\Pi_h^1(v_h) \|_{L^1(0,1)} \le \frac{\Delta x}{2}\,| \bar v_h |_{TV(0,1)}. \tag{2.22}$$

See Theorem 16 below.
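The minmod function and the two piecewise-linear limiters above translate directly into code. The following sketch assumes a uniform periodic mesh of width `dx`, with the cell means and the unlimited slopes stored in plain lists; the function names are hypothetical.

```python
def minmod(*args):
    """m(a_1, ..., a_nu): the entry of smallest magnitude if all entries
    share the same (nonzero) sign, and zero otherwise."""
    if all(a > 0 for a in args):
        return min(args)
    if all(a < 0 for a in args):
        return max(args)
    return 0.0

def limited_slopes(vbar, vx, dx, scale):
    """Limit the slopes vx against the forward and backward differences of
    the means, divided by scale * dx (periodic indexing)."""
    n = len(vbar)
    out = []
    for j in range(n):
        fwd = (vbar[(j + 1) % n] - vbar[j]) / (scale * dx)
        bwd = (vbar[j] - vbar[j - 1]) / (scale * dx)
        out.append(minmod(vx[j], fwd, bwd))
    return out

def muscl_limiter(vbar, vx, dx):
    # Differences of the means divided by the full element width.
    return limited_slopes(vbar, vx, dx, 1.0)

def lambda_pi1_limiter(vbar, vx, dx):
    # Differences divided by half the element width: the less restrictive limiter.
    return limited_slopes(vbar, vx, dx, 0.5)
```

With means `[0, 1, 2, 3]`, unit mesh width, and unlimited slopes all equal to 5, the MUSCL limiter returns the slope 1 in the two interior cells whereas the less restrictive limiter returns 2; both return 0 in the first and last cells, where the periodic wrap-around creates local extrema.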

c. The limiter $\Lambda\Pi_h^k$. In the case in which the approximate solution is piecewise a polynomial of degree $k$, that is, when

$$v_h(x,t) = \sum_{\ell=0}^{k} v_j^\ell\,\varphi_\ell(x),$$

where

$$\varphi_\ell(x) = P_\ell(2\,(x - x_j)/\Delta_j),$$

and $P_\ell$ are the Legendre polynomials, we can define a generalized slope limiter in a very simple way. To do that, we need to define what could be called the $P^1$-part of $v_h$:

$$v_h^1(x,t) = \sum_{\ell=0}^{1} v_j^\ell\,\varphi_\ell(x).$$

We define $u_h = \Lambda\Pi_h(v_h)$ as follows:

- For $j = 1,\dots,N$ compute $u_h|_{I_j}$ as follows:

1. Compute $u^-_{j+1/2}$ and $u^+_{j-1/2}$ by using (2.22) and (2.23),

2. If $u^-_{j+1/2} = v^-_{j+1/2}$ and $u^+_{j-1/2} = v^+_{j-1/2}$ set $u_h|_{I_j} = v_h|_{I_j}$,

3. If not, take $u_h|_{I_j}$ equal to $\Lambda\Pi_h^1(v_h^1)$.

d. The limiter $\Lambda\Pi_{h,\alpha}^k$. When instead of (2.22) and (2.23), we use

$$u^-_{j+1/2} = \bar v_j + m(v^-_{j+1/2} - \bar v_j,\ \bar v_j - \bar v_{j-1},\ \bar v_{j+1} - \bar v_j,\ C\,(\Delta x)^\alpha), \tag{2.23}$$

$$u^+_{j-1/2} = \bar v_j - m(\bar v_j - v^+_{j-1/2},\ \bar v_j - \bar v_{j-1},\ \bar v_{j+1} - \bar v_j,\ C\,(\Delta x)^\alpha),$$

for some fixed constant $C$ and $\alpha \in (0,1)$, we obtain a generalized slope limiter we denote by $\Lambda\Pi_{h,\alpha}^k$.

This generalized slope limiter is never used in practice, but we consider it here because it is used for theoretical purposes; see Theorem 16 below.

The complete RKDG method. Now that we have our generalized slope limiters, we can display the complete RKDG method. It is contained in the following algorithm:

- Set $u_h^0 = \Lambda\Pi_h\,P_{V_h}(u_0)$;

- For $n = 0,\dots,N-1$ compute $u_h^{n+1}$ as follows:

1. set $u_h^{(0)} = u_h^n$;

2. for $i = 1,\dots,k+1$ compute the intermediate functions:

$$u_h^{(i)} = \Lambda\Pi_h \Big\{ \sum_{l=0}^{i-1} \alpha_{il}\,u_h^{(l)} + \beta_{il}\,\Delta t^n\,L_h(u_h^{(l)}) \Big\};$$

3. set $u_h^{n+1} = u_h^{(k+1)}$.

This algorithm describes the complete RKDG method. Note how the generalized slope limiter has to be applied at each intermediate computation of the Runge-Kutta method. This way of applying the generalized slope limiter in the time-marching algorithm ensures that the scheme is TVDM, as we next show.
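The complete algorithm can be sketched for $k = 1$ as follows. Everything specific here is an assumption made for illustration: the Burgers flux $f(u) = u^2/2$, a uniform periodic mesh on $(0,1)$, the Engquist-Osher flux, the $\Lambda\Pi_h^1$ limiter, the second-order method $u^{(1)} = u^n + \Delta t\,L_h(u^n)$, $u^{n+1} = \tfrac12(u^n + u^{(1)} + \Delta t\,L_h(u^{(1)}))$, and a two-point Gauss rule for the projection of the initial data. The lists `u0` and `u1` hold the Legendre coefficients $u_j^0$ (the mean) and $u_j^1$, so the traces are $u_j^0 \pm u_j^1$.

```python
import math

S3 = math.sqrt(3.0)

def f(u):
    return 0.5 * u * u                       # Burgers flux (illustrative choice)

def eo_flux(a, b):
    return 0.5 * max(a, 0.0) ** 2 + 0.5 * min(b, 0.0) ** 2

def minmod(a, b, c):
    if a > 0 and b > 0 and c > 0:
        return min(a, b, c)
    if a < 0 and b < 0 and c < 0:
        return max(a, b, c)
    return 0.0

def limit(u0, u1):
    # Limited phi_1 coefficient: m(u^-_{j+1/2} - mean, backward diff, forward diff).
    n = len(u0)
    return [minmod(u1[j], u0[j] - u0[j - 1], u0[(j + 1) % n] - u0[j])
            for j in range(n)]

def L_h(u0, u1, dx):
    n = len(u0)
    # h[j]: flux at x_{j+1/2}, left trace from cell j, right trace from cell j+1.
    h = [eo_flux(u0[j] + u1[j], u0[(j + 1) % n] - u1[(j + 1) % n]) for j in range(n)]
    d0 = [-(h[j] - h[j - 1]) / dx for j in range(n)]
    # Two-point Gauss rule for the volume term; exact for this quadratic flux.
    d1 = [3.0 * (f(u0[j] + u1[j] / S3) + f(u0[j] - u1[j] / S3) - h[j] - h[j - 1]) / dx
          for j in range(n)]
    return d0, d1

def tv_means(u0):
    return sum(abs(u0[j] - u0[j - 1]) for j in range(len(u0)))

def run_rkdg(n_cells=80, t_final=0.5, dt_over_dx=0.2):
    dx = 1.0 / n_cells
    init = lambda x: 0.5 + math.sin(2.0 * math.pi * x)
    u0, u1 = [], []
    for j in range(n_cells):                 # projection of the initial data
        xc = (j + 0.5) * dx
        um, up = init(xc - 0.5 * dx / S3), init(xc + 0.5 * dx / S3)
        u0.append(0.5 * (um + up))
        u1.append(0.5 * S3 * (up - um))
    u1 = limit(u0, u1)
    tv, t = [tv_means(u0)], 0.0
    while t < t_final - 1e-14:
        dt = min(dt_over_dx * dx, t_final - t)
        d0, d1 = L_h(u0, u1, dx)
        a0 = [u + dt * d for u, d in zip(u0, d0)]
        a1 = limit(a0, [u + dt * d for u, d in zip(u1, d1)])      # limited stage 1
        d0, d1 = L_h(a0, a1, dx)
        b0 = [0.5 * (u + a + dt * d) for u, a, d in zip(u0, a0, d0)]
        b1 = [0.5 * (u + a + dt * d) for u, a, d in zip(u1, a1, d1)]
        u0, u1 = b0, limit(b0, b1)                                # limited stage 2
        t += dt
        tv.append(tv_means(u0))
    return u0, u1, tv
```

The default ratio $\Delta t/\Delta x = 0.2$ is chosen so that, for data with values in $[-0.5, 1.5]$, both the stability bound quoted for $k = 1$ and the CFL condition (2.20) hold. Under these assumptions the total variation of the means returned in `tv` should be nonincreasing and the limiter leaves the means untouched, so mass is conserved.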

The TVDM property of the RKDG method. To do that, we start by noting that if we set

$$u_h = \Lambda\Pi_h(v_h), \qquad w_h = u_h + \delta\,L_h(u_h),$$

then we have that

$$| \bar u_h |_{TV(0,1)} \le | \bar v_h |_{TV(0,1)}, \tag{2.24}$$

$$| \bar w_h |_{TV(0,1)} \le | \bar u_h |_{TV(0,1)}, \qquad \forall\, |\delta| \le \delta_0, \tag{2.25}$$

where

$$\delta_0^{-1} = \max_j\, 2\,\Big( \frac{| f^+ |_{Lip}}{\Delta_{j+1}} + \frac{| f^- |_{Lip}}{\Delta_j} \Big),$$

by Proposition 13. By using the above two properties of the generalized slope limiter, it is possible to show that the RKDG method is TVDM.

Fig. 2.1. Example of slope limiters: The MUSCL limiter (top) and the less restrictive $\Lambda\Pi_h^1$ limiter (bottom). Displayed are the local means of $u_h$ (thick line), the linear function $u_h$ in the element of the middle before limiting (dotted line) and the resulting function after limiting (solid line).

Theorem 14. (Stability induced by the generalized slope limiter) Assume that the generalized slope limiter $\Lambda\Pi_h$ satisfies the properties (2.24) and (2.25). Assume also that all the coefficients $\alpha_{il}$ are nonnegative and satisfy the following condition:

$$\sum_{l=0}^{i-1} \alpha_{il} = 1, \qquad i = 1,\dots,k+1.$$

Then

$$| \bar u_h^n |_{TV(0,1)} \le | u_0 |_{TV(0,1)}, \qquad \forall\, n \ge 0.$$

Proof. The proof of this result is very similar to that of Theorem 8. Thus, we
start by rewriting our time discretization as follows:

- Set $u_h^0 = u_{0h}$;

- For $n = 0, \dots, N-1$ compute $u_h^{n+1}$ from $u_h^n$ as follows:

1. set $u_h^{(0)} = u_h^n$;

2. for $i = 1, \dots, k+1$ compute the intermediate functions:

$$u_h^{(i)} = \Lambda\Pi_h \Big\{ \sum_{l=0}^{i-1} \alpha_{il}\, w_h^{(il)} \Big\},$$

where

$$w_h^{(il)} = u_h^{(l)} + \frac{\beta_{il}}{\alpha_{il}}\, \Delta t^n\, L_h(u_h^{(l)});$$

3. set $u_h^{n+1} = u_h^{(k+1)}$.

Then we have

$$|\,\bar u_h^{(i)}\,|_{TV(0,1)} \le \Big|\, \sum_{l=0}^{i-1} \alpha_{il}\, \bar w_h^{(il)} \Big|_{TV(0,1)}, \quad \text{by (2.24)},$$

$$\le \sum_{l=0}^{i-1} \alpha_{il}\, |\,\bar w_h^{(il)}\,|_{TV(0,1)}, \quad \text{since } \alpha_{il} \ge 0,$$

$$\le \sum_{l=0}^{i-1} \alpha_{il}\, |\,\bar u_h^{(l)}\,|_{TV(0,1)}, \quad \text{by (2.25)},$$

$$\le \max_{0 \le l \le i-1} |\,\bar u_h^{(l)}\,|_{TV(0,1)},$$

since

$$\sum_{l=0}^{i-1} \alpha_{il} = 1.$$

It is clear now that the inequality

$$|\,\bar u_h^n\,|_{TV(0,1)} \le |\,\bar u_h^0\,|_{TV(0,1)}, \qquad \forall\, n \ge 0,$$

follows from the above inequality by a simple induction argument. To obtain the
result of the theorem, it is enough to note that we have

$$|\,\bar u_h^0\,|_{TV(0,1)} \le |\,u_0\,|_{TV(0,1)},$$

by the definition of the initial condition $u_h^0$. This completes the proof.

TVBM generalized slope limiters. As was pointed out before, it is possible
to modify the generalized slope limiters displayed in the examples above in such a
way that the degradation of the accuracy at local extrema is avoided. To achieve
this, we follow Shu [82] and modify the definition of the generalized slope limiters by
simply replacing the minmod function $m$ by the TVB corrected minmod function
$\bar m$ defined as follows:

$$\bar m(a_1, \dots, a_m) = \begin{cases} a_1 & \text{if } |a_1| \le M (\Delta x)^2, \\ m(a_1, \dots, a_m) & \text{otherwise,} \end{cases} \qquad (2.26)$$

where $M$ is a given constant. We call the generalized slope limiters thus constructed
TVBM slope limiters.
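As an illustration of (2.26), here is a minimal Python sketch of the minmod function and of its TVB correction. The function names `minmod` and `tvb_minmod` are ours, and we assume the usual definition of minmod (the argument of smallest magnitude when all arguments share the same sign, zero otherwise).

```python
from typing import Sequence


def minmod(*a: float) -> float:
    """Smallest-magnitude argument if all share one sign, else 0."""
    if all(x > 0 for x in a):
        return min(a)
    if all(x < 0 for x in a):
        return max(a)
    return 0.0


def tvb_minmod(a: Sequence[float], M: float, dx: float) -> float:
    """TVB corrected minmod: leave a[0] untouched when |a[0]| <= M*dx**2."""
    if abs(a[0]) <= M * dx * dx:
        return a[0]
    return minmod(*a)
```

Taking `M = 0` recovers the plain minmod function, so the TVDM limiters are the special case $M = 0$ of the TVBM ones.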

The constant $M$ is, of course, an upper bound of the absolute value of the
second-order derivative of the solution at local extrema. In the case of the nonlinear
conservation laws under consideration, it is easy to see that, if the initial data is
piecewise $C^2$, we can take

$$M = \sup\{\, |(u_0)_{xx}(y)|,\ y : (u_0)_x(y) = 0 \,\}.$$

See [24] for other choices of $M$.

Thus, if the constant $M$ is taken as above, there is no degeneracy of accuracy
at the extrema and the resulting RKDG scheme retains its optimal accuracy.
Moreover, we have the following stability result.


Theorem 15. (The TVBM property) Assume that the generalized slope limiter
$\Lambda\Pi_h$ is a TVBM slope limiter. Assume also that all the coefficients $\alpha_{il}$ are
nonnegative and satisfy the following condition:

$$\sum_{l=0}^{i-1} \alpha_{il} = 1, \qquad i = 1, \dots, k+1.$$

Then

$$|\,\bar u_h^n\,|_{TV(0,1)} \le |\,u_0\,|_{TV(0,1)} + C\,M, \qquad \forall\, n \ge 0,$$

where $C$ depends on $k$ only.

Convergence in the nonlinear case. By using the above stability
results, we can use the Ascoli-Arzela theorem to prove the following convergence
result.


Theorem 16. (Convergence to the entropy solution) Assume that the generalized
slope limiter $\Lambda\Pi_h$ is a TVDM or a TVBM slope limiter. Assume also that all the
coefficients $\alpha_{il}$ are nonnegative and satisfy the following condition:

$$\sum_{l=0}^{i-1} \alpha_{il} = 1, \qquad i = 1, \dots, k+1.$$

Then there is a subsequence $\{\bar u_{h'}\}_{h'>0}$ of the sequence $\{\bar u_h\}_{h>0}$ generated by the
RKDG scheme that converges in $L^\infty(0,T; L^1(0,1))$ to a weak solution of the problem
(2.1), (2.2).

Moreover, if the TVBM version of the slope limiter $\Lambda\Pi^k_{h,\alpha}$ is used, the weak
solution is the entropy solution and the whole sequence converges.

Finally, if the generalized slope limiter $\Lambda\Pi_h$ is such that

$$\|\, \bar v_h - \Lambda\Pi_h(v_h) \,\|_{L^1(0,1)} \le C\, \Delta x\, |\,\bar v_h\,|_{TV(0,1)},$$

then the above results hold not only for the sequence of the means $\{\bar u_h\}_{h>0}$ but also for
the sequence of the functions $\{u_h\}_{h>0}$.


Error estimates for an implicit version of the discontinuous Galerkin method
(with the so-called shock-capturing terms) have been obtained by Cockburn and
Gremaud [19].


2.5  Computational  results 

In this section, we display the performance of the RKDG schemes in two simple
but typical test problems. We use piecewise linear ($k = 1$) and piecewise quadratic
($k = 2$) elements; the $\Lambda\Pi_h^k$ generalized slope limiter is used.

The first test problem. We consider the simple transport equation with
periodic boundary conditions:

$$u_t + u_x = 0,$$

$$u(x,0) = \begin{cases} 1 & .4 < x < .6, \\ 0 & \text{otherwise.} \end{cases}$$


We use this test problem to show that the use of high-order polynomial
approximation does improve the approximation of the discontinuities (or, in this case,
'contacts'). To amplify the effect of the dissipation of the method, we take $T = 100$,
that is, we let the solution travel 100 times across the domain. We run the scheme
with CFL = 0.9 * 1 = 0.9 for $k = 0$, CFL = 0.9 * 1/3 = 0.3 for $k = 1$, and
CFL = 0.9 * 1/5 = 0.18 for $k = 2$. In Figure 2.2, we can see that the dissipation
effect decreases as the degree of the polynomial $k$ increases; we also see that the
dissipation effect for a given $k$ decreases as $\Delta x$ decreases, as expected. Other
experiments in this direction have been performed by Atkins and Shu [1]. For
example, they show that when polynomials of degree $k = 11$ are used, there is no
detectable decay of the approximate solution.

To assess if the use of high degree polynomials is advantageous, we must compare
the efficiencies of the schemes; we only compare the efficiencies of the method for
$k = 1$ and $k = 2$. We define the inverse of the efficiency of the method as the
product of the error times the number of operations. Since the RKDG method that
uses quadratic elements has 0.3/0.2 times more time steps, 3/2 times more inner
iterations per time step, and (3 x 3)/(2 x 2) times more operations per element, its
number of operations is 81/16 times bigger than that of the RKDG method
using linear elements. Hence, the ratio of the efficiency of the RKDG method with
quadratic elements to that of the RKDG method with linear elements is

$$\text{eff. ratio} = \frac{16}{81}\, \frac{\text{error}(RKDG(k=1))}{\text{error}(RKDG(k=2))}.$$

In Table 2, we see that the use of a higher degree does result in a more efficient
resolution of the contact discontinuities. This fact remains true for systems, as we
can see from the numerical experiments for the double Mach reflection problem in
the next chapter.
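The operation count behind the factor 81/16 and the resulting efficiency ratio can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the names `OPS_RATIO` and `eff_ratio` are ours. The sample errors are the $\Delta x = 1/10$ entries of Tables 4 and 7 (Burgers equation, $M = 20$, $T = 0.05$), and the results agree with the first row of Table 9.

```python
from fractions import Fraction

# Cost of quadratic elements relative to linear elements.
time_steps = Fraction(3, 10) / Fraction(2, 10)   # CFL 0.3 versus CFL 0.2
inner_iters = Fraction(3, 2)                     # 3 versus 2 Runge-Kutta stages
ops_per_element = Fraction(3 * 3, 2 * 2)         # operations per element
OPS_RATIO = time_steps * inner_iters * ops_per_element   # equals 81/16


def eff_ratio(error_k1: float, error_k2: float) -> float:
    """Efficiency of k = 2 relative to k = 1: (16/81) * error_k1 / error_k2."""
    return float(1 / OPS_RATIO) * error_k1 / error_k2


l1_ratio = eff_ratio(1073.58, 37.31)     # L1 errors at dx = 1/10
linf_ratio = eff_ratio(2406.38, 101.44)  # L-infinity errors at dx = 1/10
```

Rounded to two digits, `l1_ratio` and `linf_ratio` are 5.68 and 4.69.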

The second test problem. We consider the standard Burgers equation with
periodic boundary conditions:

$$u_t + \Big(\frac{u^2}{2}\Big)_x = 0,$$

$$u(x,0) = u_0(x) = \frac14 + \frac12 \sin(\pi(2x - 1)).$$

Our purpose is to show that (i) when the constant $M$ is properly chosen, the
RKDG method using polynomials of degree $k$ is of order $k + 1$ in the uniform norm
away from the discontinuities, that (ii) it is computationally more efficient to use
high-degree polynomial approximations, and that (iii) shocks are captured in a few
elements without production of spurious oscillations.

The exact solution is smooth at $T = 0.05$ and has a well developed shock at
$T = 0.4$; notice that there is a sonic point. In Tables 3, 4, and 5, the history of
convergence of the RKDG method using piecewise linear elements is displayed, and
in Tables 6, 7, and 8, the history of convergence of the RKDG method using piecewise
quadratic elements. It can be seen that when the TVDM generalized slope limiter
is used, i.e., when we take $M = 0$, there is degradation of the accuracy of the
scheme, whereas when the TVBM generalized slope limiter is used with a properly
chosen constant $M$, i.e., when $M = 20 \ge 2\pi^2$, the scheme is uniformly high order
in regions of smoothness that include critical and sonic points.
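The bound $2\pi^2$ can be checked directly from the formula $M = \sup\{ |(u_0)_{xx}(y)| : (u_0)_x(y) = 0 \}$. The sketch below assumes the initial data $u_0(x) = 1/4 + (1/2)\sin(\pi(2x-1))$ as we read it from the text; the function names and the hard-coded critical points $y = 1/4, 3/4$ are ours.

```python
import math


def u0_x(y: float) -> float:
    """First derivative of u0(x) = 1/4 + 1/2 * sin(pi * (2x - 1))."""
    return math.pi * math.cos(math.pi * (2 * y - 1))


def u0_xx(y: float) -> float:
    """Second derivative of the same initial data."""
    return -2 * math.pi ** 2 * math.sin(math.pi * (2 * y - 1))


# Local extrema of u0 on (0, 1): cos(pi * (2y - 1)) = 0 at y = 1/4 and y = 3/4.
CRITICAL_POINTS = (0.25, 0.75)
M_BOUND = max(abs(u0_xx(y)) for y in CRITICAL_POINTS)   # equals 2 * pi**2
```

Since $2\pi^2 \approx 19.74$, the choice $M = 20$ is only slightly above the bound.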

Next, we compare the efficiency of the RKDG schemes for $k = 1$ and $k = 2$
for the case $M = 20$ and $T = 0.05$. The results are displayed in Table 9. We can
see that the efficiency of the RKDG scheme with quadratic polynomials is several
times that of the RKDG scheme with linear polynomials even for very small values
of $\Delta x$. We can also see that the efficiency ratio is proportional to $(\Delta x)^{-1}$, which
is expected for smooth solutions. This indicates that it is indeed more efficient to
work with RKDG methods using polynomials of higher degree.
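The "order" columns of Table 9 can be recovered from consecutive efficiency ratios: if a quantity behaves like $(\Delta x)^p$ and $\Delta x$ is halved, then $p = \log_2(\text{coarse}/\text{fine})$. A small sketch (the function name `observed_orders` is ours) applied to the Table 9 entries:

```python
import math
from typing import List, Sequence


def observed_orders(values: Sequence[float], refinement: float = 2.0) -> List[float]:
    """Exponent p such that values scale like dx**p under mesh refinement."""
    return [math.log(coarse / fine) / math.log(refinement)
            for coarse, fine in zip(values, values[1:])]


# Efficiency ratios of Table 9 for dx = 1/10, 1/20, 1/40, 1/80.
l1_orders = observed_orders([5.68, 11.96, 25.83, 52.97])
linf_orders = observed_orders([4.69, 31.02, 70.90, 148.42])
```

The exponents cluster around -1 on the finer meshes, which is the $(\Delta x)^{-1}$ behaviour mentioned above.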

That this is also true when the solution displays shocks can be seen in Figures
2.3, 2.4, and 2.5. In Figure 2.3, it can be seen that the shock is captured in
essentially two elements. Details of these figures are shown in Figures 2.4 and 2.5,
where the approximations right in front of the shock are shown. It is clear that
the approximation using quadratic elements is superior to the approximation using
linear elements. Finally, we illustrate in Figure 2.6 how the schemes follow a shock
when it goes through a single element.

2.6  Concluding  remarks 

In this section, which is the core of these notes, we have devised the general RKDG
method for nonlinear scalar conservation laws with periodic boundary conditions.

We have seen that the RKDG methods are constructed in three steps. First, the
Discontinuous Galerkin method is used to discretize the conservation law in space. Then,
an explicit TVB-Runge-Kutta time discretization is used to discretize the
resulting ODE system. Finally, a generalized slope limiter is introduced that enforces
nonlinear stability without degrading the accuracy of the scheme.

We have seen that the numerical results show that the RKDG methods using
polynomials of degree $k$, $k = 1, 2$, are uniformly $(k+1)$-th order accurate away
from discontinuities and that the use of high degree polynomials renders the RKDG
method more efficient, even close to discontinuities.

All these results can be extended to the initial boundary value problem in
a very simple way; see [24]. In what follows, we extend the RKDG methods to
multidimensional systems.


Table 2

Comparison of the efficiencies of RKDG schemes for $k = 1$ and $k = 2$.
Transport equation with $M = 0$, and $T = 100$.

Table 3

$P^1$, $M = 0$, CFL = 0.3, $T = 0.05$.

            L^1(0,1)-error           L^inf(0,1)-error
Dx          10^5 * error   order     10^5 * error
1/10          1286.23      -           3491.79
1/20           334.93      1.85        1129.21
1/40            85.32      1.97         449.29
1/80            21.64      1.98         137.30
1/160            5.49      1.98          45.10
1/320            1.37      2.00          14.79
1/640            0.34      2.01           4.85
1/1280           0.08      2.02           1.60


Table 4

$P^1$, $M = 20$, CFL = 0.3, $T = 0.05$.

            L^1(0,1)-error           L^inf(0,1)-error
Dx          10^5 * error   order     10^5 * error
1/10          1073.58      -           2406.38
1/20           277.38      1.95         628.12
1/40            71.92      1.95         161.65
1/80            18.77      1.94          42.30
1/160            4.79      1.97          10.71
1/320            1.21      1.99           2.82
1/640            0.30      2.00           0.78
1/1280           0.08      2.00           0.21

Fig. 2.2. Comparison of the exact and the approximate solutions for the linear case
$f(u) = u$. Top: $\Delta x = 1/40$, middle: $\Delta x = 1/80$, bottom: $\Delta x = 1/160$. Exact
solution (solid line), piecewise-constant elements (dash/dotted line), piecewise-linear
elements (dotted line) and piecewise-quadratic elements (dashed line).

Table 5

Errors in smooth region $\Omega = \{x : |x - \text{shock}| \ge 0.1\}$.
$P^1$, $M = 20$, CFL = 0.3, $T = 0.4$.

            L^1(Omega)-error         L^inf(Omega)-error
Dx          10^5 * error   order     10^5 * error   order
1/10          1477.16      -          17027.32      -
1/20           155.67      3.25        1088.55      3.97
1/40            38.35      2.02         247.35      2.14
1/80             9.70      1.98          65.30      1.92
1/160            2.44      1.99          17.35      1.91
1/320            0.61      1.99           4.48      1.95
1/640            0.15      2.00           1.14      1.98
1/1280           0.04      2.00           0.29      1.99

Table 6

$P^2$, $M = 0$, CFL = 0.2, $T = 0.05$.

            L^1(0,1)-error           L^inf(0,1)-error
Dx          10^5 * error   order     10^5 * error   order
1/10          2066.13      -          16910.05      -
1/20           251.79      3.03        3014.64      2.49
1/40            42.52      2.57        1032.53      1.55
1/80             7.56      2.49         336.62      1.61

Table 7

$P^2$, $M = 20$, CFL = 0.2, $T = 0.05$.

            L^1(0,1)-error              L^inf(0,1)-error
Dx          10^5 * error   order        10^5 * error   order
1/10            37.31      -              101.44       -
1/20             4.58      [illegible]     13.50       2.91
1/40             0.55      [illegible]      1.52       3.15
1/80             0.07      3.08             0.19       3.01

Table 8

Errors in smooth region $\Omega = \{x : |x - \text{shock}| \ge 0.1\}$.
$P^2$, $M = 20$, CFL = 0.2, $T = 0.4$.

            L^1(Omega)-error         L^inf(Omega)-error
Dx          10^5 * error   order     10^5 * error   order
1/10           786.36      -          16413.79      -
1/20             5.52      7.16          86.01      7.58
1/40             0.36      3.94          15.49      2.47
1/80             0.06      2.48           0.54      4.84

Table 9

Comparison of the efficiencies of RKDG schemes for $k = 1$ and $k = 2$.
Burgers equation with $M = 20$, and $T = 0.05$.

            L^1-norm                 L^inf-norm
Dx          eff. ratio    order      eff. ratio    order
1/10           5.68       -             4.69       -
1/20          11.96       -1.07        31.02       -2.73
1/40          25.83       -1.11        70.90       -1.19
1/80          52.97       -1.04       148.42       -1.07

2.7 Appendix: Proof of the $L^2$-error estimates

Proof of the $L^2$-stability. In this section, we prove the stability result of
Proposition 1. To do that, we first show how to obtain the corresponding stability
result for the exact solution and then mimic the argument to obtain Proposition 1.

The continuous case as a model. We start by rewriting the equations (2.4)
in compact form. If in the equations (2.4) we replace $v(x)$ by $v(x,t)$, sum on $j$ from
1 to $N$, and integrate in time from 0 to $T$, we obtain

$$\forall\, v : v(t) \text{ is smooth } \forall\, t \in (0,T): \qquad B(u,v) = 0, \qquad (2.27)$$

where

$$B(u,v) = \int_0^T \!\! \int_0^1 \big\{ \partial_t u(x,t)\, v(x,t) - c\, u(x,t)\, \partial_x v(x,t) \big\}\, dx\, dt.$$

Taking $v = u$, we easily see that

$$B(u,u) = \tfrac12 \|\, u(T) \,\|^2_{L^2(0,1)} - \tfrac12 \|\, u_0 \,\|^2_{L^2(0,1)},$$

and since

$$B(u,u) = 0,$$

by (2.27), we immediately obtain the following $L^2$-stability result:

$$\tfrac12 \|\, u(T) \,\|^2_{L^2(0,1)} = \tfrac12 \|\, u_0 \,\|^2_{L^2(0,1)}.$$
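The identity for $B(u,u)$ used above is stated without proof. A short derivation, assuming only that $u$ is smooth and 1-periodic in $x$, is the following:

$$B(u,u) = \int_0^T \!\! \int_0^1 \big\{ \partial_t u\, u - c\, u\, \partial_x u \big\}\, dx\, dt = \int_0^T \!\! \int_0^1 \partial_t \big( \tfrac12 u^2 \big)\, dx\, dt - c \int_0^T \!\! \int_0^1 \partial_x \big( \tfrac12 u^2 \big)\, dx\, dt$$

$$= \tfrac12 \|\, u(T) \,\|^2_{L^2(0,1)} - \tfrac12 \|\, u_0 \,\|^2_{L^2(0,1)} - \frac{c}{2} \int_0^T \big( u^2(1,t) - u^2(0,t) \big)\, dt,$$

and the last integral vanishes because $u(1,t) = u(0,t)$ by periodicity.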


This is the argument we have to mimic in order to prove Proposition 1.

The discrete case. Thus, we start by finding the discrete version of the form
$B(\cdot,\cdot)$. If we replace $v(x)$ by $v_h(x,t)$ in the equation (2.7), sum on $j$ from 1 to $N$,
and integrate in time from 0 to $T$, we obtain

$$\forall\, v_h : v_h(t) \in V_h^k \ \ \forall\, t \in (0,T): \qquad B_h(u_h, v_h) = 0, \qquad (2.28)$$

where

$$B_h(u_h, v_h) = \int_0^T \!\! \int_0^1 \partial_t u_h(x,t)\, v_h(x,t)\, dx\, dt - \int_0^T \sum_{1 \le j \le N} \int_{I_j} c\, u_h(x,t)\, \partial_x v_h(x,t)\, dx\, dt - \int_0^T \sum_{1 \le j \le N} h(u_h)_{j+1/2}(t)\, [\,v_h(t)\,]_{j+1/2}\, dt. \qquad (2.29)$$

Following the model provided by the continuous case, we next obtain an
expression for $B_h(w_h, w_h)$. It is contained in the following result, which will be proved
later.


Lemma 17. We have

$$B_h(w_h, w_h) = \tfrac12 \|\, w_h(T) \,\|^2_{L^2(0,1)} + \Theta_T(w_h) - \tfrac12 \|\, w_h(0) \,\|^2_{L^2(0,1)},$$

where

$$\Theta_T(w_h) = \frac{|c|}{2} \int_0^T \sum_{1 \le j \le N} [\,w_h(t)\,]^2_{j+1/2}\, dt.$$

Taking $w_h = u_h$ in the above result and noting that by (2.28),

$$B_h(u_h, u_h) = 0,$$

we get the equality

$$\tfrac12 \|\, u_h(T) \,\|^2_{L^2(0,1)} + \Theta_T(u_h) = \tfrac12 \|\, u_h(0) \,\|^2_{L^2(0,1)},$$

from which Proposition 1 easily follows, since

$$\tfrac12 \|\, u_h(0) \,\|^2_{L^2(0,1)} \le \tfrac12 \|\, u_0 \,\|^2_{L^2(0,1)},$$

by (2.8). It only remains to prove Lemma 17.

Proof of Lemma 17. After setting $u_h = v_h = w_h$ in the definition of $B_h$,
(2.29), we get

$$B_h(w_h, w_h) = \tfrac12 \|\, w_h(T) \,\|^2_{L^2(0,1)} + \int_0^T \Theta_{diss}(t)\, dt - \tfrac12 \|\, w_h(0) \,\|^2_{L^2(0,1)},$$

where

$$\Theta_{diss}(t) = - \sum_{1 \le j \le N} \Big\{ h(w_h)_{j+1/2}(t)\, [\,w_h(t)\,]_{j+1/2} + \int_{I_j} c\, w_h(x,t)\, \partial_x w_h(x,t)\, dx \Big\}.$$

We only have to show that $\int_0^T \Theta_{diss}(t)\, dt = \Theta_T(w_h)$. To do that, we proceed as
follows. Dropping the dependence on the variable $t$ and setting

$$\bar w_h(x_{j+1/2}) = \tfrac12 \big( w_h(x^-_{j+1/2}) + w_h(x^+_{j+1/2}) \big),$$

we have, by the definition of the flux $h$, (2.11),

$$- \sum_{1 \le j \le N} h(w_h)_{j+1/2}\, [\,w_h\,]_{j+1/2} = - \sum_{1 \le j \le N} \Big\{ c\, \bar w_h\, [\,w_h\,] - \frac{|c|}{2}\, [\,w_h\,]^2 \Big\}_{j+1/2},$$

and

$$- \sum_{1 \le j \le N} \int_{I_j} c\, w_h(x)\, \partial_x w_h(x)\, dx = \frac{c}{2} \sum_{1 \le j \le N} [\,w_h^2\,]_{j+1/2} = c \sum_{1 \le j \le N} \big\{ \bar w_h\, [\,w_h\,] \big\}_{j+1/2}.$$

Hence

$$\Theta_{diss}(t) = \frac{|c|}{2} \sum_{1 \le j \le N} [\,w_h(t)\,]^2_{j+1/2},$$

and the result follows. This completes the proof of Lemma 17.

This completes the proof of Proposition 1.


Proof of Theorem 2. In this section, we prove the error estimate of Theorem 2,
which holds for the linear case $f(u) = c\,u$. To do that, we first show how to estimate
the error between the solutions $u_\nu$, $\nu = 1, 2$, of

$$\partial_t u_\nu + \partial_x f(u_\nu) = 0 \quad \text{in } (0,T) \times (0,1),$$

$$u_\nu(t = 0) = u_{0,\nu} \quad \text{on } (0,1).$$

Then, we mimic the argument in order to prove Theorem 2.

The continuous case as a model. By the definition of the form $B(\cdot,\cdot)$, (2.27),
we have, for $\nu = 1, 2$,

$$B(u_\nu, v) = 0, \qquad \forall\, v : v(t) \text{ is smooth } \forall\, t \in (0,T).$$

Since the form $B(\cdot,\cdot)$ is bilinear, from the above equation we obtain the so-called
error equation:

$$\forall\, v : v(t) \text{ is smooth } \forall\, t \in (0,T): \qquad B(e, v) = 0, \qquad (2.30)$$

where $e = u_1 - u_2$. Now, since

$$B(e,e) = \tfrac12 \|\, e(T) \,\|^2_{L^2(0,1)} - \tfrac12 \|\, e(0) \,\|^2_{L^2(0,1)},$$

and

$$B(e,e) = 0,$$

by the error equation (2.30), we immediately obtain the error estimate we sought:

$$\tfrac12 \|\, e(T) \,\|^2_{L^2(0,1)} = \tfrac12 \|\, u_{0,1} - u_{0,2} \,\|^2_{L^2(0,1)}.$$

To prove Theorem 2, we only need to obtain a discrete version of this argument.


The discrete case. Since

$$B_h(u_h, v_h) = 0, \qquad \forall\, v_h : v_h(t) \in V_h \ \ \forall\, t \in (0,T),$$

$$B_h(u, v_h) = 0, \qquad \forall\, v_h : v_h(t) \in V_h \ \ \forall\, t \in (0,T),$$

by (2.7) and by equations (2.4), respectively, we easily obtain our error equation:

$$\forall\, v_h : v_h(t) \in V_h \ \ \forall\, t \in (0,T): \qquad B_h(e, v_h) = 0, \qquad (2.31)$$

where $e = u - u_h$.

Now, according to the continuous case argument, we should consider next the
quantity $B_h(e, e)$; however, since $e(t)$ is not in the finite element space $V_h$, it is
more convenient to consider $B_h(P_h(e), P_h(e))$, where $P_h(e(t))$ is the $L^2$-projection
of the error $e(t)$ into the finite element space $V_h$.

The $L^2$-projection of the function $p \in L^2(0,1)$ into $V_h$, $P_h(p)$, is defined as the
only element of the finite element space $V_h$ such that

$$\forall\, v_h \in V_h: \qquad \int_0^1 \big( P_h(p)(x) - p(x) \big)\, v_h(x)\, dx = 0. \qquad (2.32)$$

Note that in fact $u_h(t = 0) = P_h(u_0)$, by (2.8).

Thus, by Lemma 17, we have

$$B_h(P_h(e), P_h(e)) = \tfrac12 \|\, P_h(e(T)) \,\|^2_{L^2(0,1)} + \Theta_T(P_h(e)) - \tfrac12 \|\, P_h(e(0)) \,\|^2_{L^2(0,1)},$$

and since

$$P_h(e(0)) = P_h(u_0 - u_h(0)) = P_h(u_0) - u_h(0) = 0,$$

and

$$B_h(P_h(e), P_h(e)) = B_h(P_h(e) - e, P_h(e)) = B_h(P_h(u) - u, P_h(e)),$$

by the error equation (2.31), we get

$$\tfrac12 \|\, P_h(e(T)) \,\|^2_{L^2(0,1)} + \Theta_T(P_h(e)) = B_h(P_h(u) - u, P_h(e)). \qquad (2.33)$$

It only remains to estimate the right-hand side

$$B_h(P_h(u) - u, P_h(e)),$$

which, according to our continuous model, should be small.

Estimating the right-hand side. To show that this is so, we must suitably
treat the term $B_h(P_h(u) - u, P_h(e))$. We start with the following remarkable result.

Lemma 18. We have

$$B_h(P_h(u) - u, P_h(e)) = - \int_0^T \sum_{1 \le j \le N} h(P_h(u) - u)_{j+1/2}(t)\, [\,P_h(e)(t)\,]_{j+1/2}\, dt.$$

Proof. Setting $p = P_h(u) - u$ and $v_h = P_h(e)$ and recalling the definition of
$B_h(\cdot,\cdot)$, (2.29), we have

$$B_h(p, v_h) = \int_0^T \!\! \int_0^1 \partial_t p(x,t)\, v_h(x,t)\, dx\, dt - \int_0^T \sum_{1 \le j \le N} \int_{I_j} c\, p(x,t)\, \partial_x v_h(x,t)\, dx\, dt - \int_0^T \sum_{1 \le j \le N} h(p)_{j+1/2}(t)\, [\,v_h(t)\,]_{j+1/2}\, dt$$

$$= - \int_0^T \sum_{1 \le j \le N} h(p)_{j+1/2}(t)\, [\,v_h(t)\,]_{j+1/2}\, dt,$$

by the definition of the $L^2$-projection (2.32). This completes the proof.

Now, we can see that a simple application of Young's inequality and a
standard approximation result should give us the estimate we were looking for. The
approximation result we need is the following.


Lemma 19. If $w \in H^{k+1}(I_j \cup I_{j+1})$, then

$$|\, h(P_h(w) - w)(x_{j+1/2}) \,| \le c_k\, (\Delta x)^{k+1/2}\, \frac{|c|}{2}\, |\,w\,|_{H^{k+1}(I_j \cup I_{j+1})},$$

where the constant $c_k$ depends solely on $k$.


Proof. Dropping the argument $x_{j+1/2}$ we have, by the definition (2.11) of the
flux $h$,

$$|\, h(P_h(w) - w) \,| = \Big| \frac{c}{2} \big( P_h(w)^+ + P_h(w)^- \big) - \frac{|c|}{2} \big( P_h(w)^+ - P_h(w)^- \big) - c\, w \Big|$$

$$= \Big| \frac{c - |c|}{2} \big( P_h(w)^+ - w \big) + \frac{c + |c|}{2} \big( P_h(w)^- - w \big) \Big|$$

$$\le |c|\, \max\{\, |P_h(w)^+ - w|,\ |P_h(w)^- - w| \,\},$$

and the result follows from the properties of $P_h$ after a simple application of the
Bramble-Hilbert lemma; see [16]. This completes the proof.

An immediate consequence of this result is the estimate we wanted.


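Lemma 20. We have

$$B_h(P_h(u) - u, P_h(e)) \le c_k^2\, (\Delta x)^{2k+1}\, \frac{|c|}{2}\, T\, |\,u_0\,|^2_{H^{k+1}(0,1)} + \tfrac12\, \Theta_T(P_h(e)),$$

where the constant $c_k$ depends solely on $k$.

Proof. After using Young's inequality in the right-hand side of Lemma 18, we
get

$$B_h(P_h(u) - u, P_h(e)) \le \int_0^T \sum_{1 \le j \le N} \frac{1}{|c|}\, |\, h(P_h(u) - u)_{j+1/2}(t) \,|^2\, dt + \int_0^T \sum_{1 \le j \le N} \frac{|c|}{4}\, [\,P_h(e)(t)\,]^2_{j+1/2}\, dt.$$

By Lemma 19 and the definition of the form $\Theta_T$, we get

$$B_h(P_h(u) - u, P_h(e)) \le c_k^2\, (\Delta x)^{2k+1}\, \frac{|c|}{4} \int_0^T \sum_{1 \le j \le N} |\,u(t)\,|^2_{H^{k+1}(I_j \cup I_{j+1})}\, dt + \tfrac12\, \Theta_T(P_h(e))$$

$$\le c_k^2\, (\Delta x)^{2k+1}\, \frac{|c|}{2}\, T\, |\,u_0\,|^2_{H^{k+1}(0,1)} + \tfrac12\, \Theta_T(P_h(e)).$$

This completes the proof.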
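For the reader's convenience, the form of Young's inequality used in the proof of Lemma 20 can be written out explicitly. This is a sketch under the assumption $c \neq 0$ and with $\Theta_T(w_h) = \frac{|c|}{2} \int_0^T \sum_j [\,w_h(t)\,]^2_{j+1/2}\, dt$ as we read it from Lemma 17; the symbols $a$ and $b$ are ours. For real numbers $a$ and $b$,

$$-a\,b \le \frac{1}{|c|}\, a^2 + \frac{|c|}{4}\, b^2, \qquad \text{since} \qquad \Big( \frac{a}{\sqrt{|c|}} + \frac{\sqrt{|c|}}{2}\, b \Big)^2 \ge 0.$$

Applying this with $a = h(P_h(u) - u)_{j+1/2}(t)$ and $b = [\,P_h(e)(t)\,]_{j+1/2}$, summing on $j$ and integrating in time, the second term becomes

$$\frac{|c|}{4} \int_0^T \sum_{1 \le j \le N} [\,P_h(e)(t)\,]^2_{j+1/2}\, dt = \tfrac12\, \Theta_T(P_h(e)),$$

which is why only half of the dissipation term appears on the right-hand side of Lemma 20.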

Conclusion. Finally, inserting in the equation (2.33) the estimate of its
right-hand side obtained in Lemma 20, we get

$$\|\, P_h(e(T)) \,\|^2_{L^2(0,1)} + \Theta_T(P_h(e)) \le c_k^2\, (\Delta x)^{2k+1}\, |c|\, T\, |\,u_0\,|^2_{H^{k+1}(0,1)}.$$

Theorem 2 now follows from the above estimate and from the following inequality:

$$\|\, e(T) \,\|_{L^2(0,1)} \le \|\, u(T) - P_h(u(T)) \,\|_{L^2(0,1)} + \|\, P_h(e(T)) \,\|_{L^2(0,1)}$$

$$\le c'_k\, (\Delta x)^{k+1}\, |\,u_0\,|_{H^{k+1}(0,1)} + \|\, P_h(e(T)) \,\|_{L^2(0,1)}.$$


Proof of Theorem 3. To prove Theorem 3, we only have to suitably modify the
proof of Theorem 2. The modification consists in replacing the $L^2$-projection of the
error, $P_h(e)$, by another projection that we denote by $R_h(e)$.

Given a function $p \in L^\infty(0,1)$ that is continuous on each element $I_j$, we define
$R_h(p)$ as the only element of the finite element space $V_h$ such that

$$\forall\, j = 1, \dots, N: \qquad R_h(p)(x_{j,\ell}) - p(x_{j,\ell}) = 0, \quad \ell = 0, \dots, k, \qquad (2.34)$$

where the points $x_{j,\ell}$ are the Gauss-Radau quadrature points of the interval $I_j$. We
take

$$x_{j,k} = \begin{cases} x_{j+1/2} & \text{if } c > 0, \\ x_{j-1/2} & \text{if } c < 0. \end{cases} \qquad (2.35)$$

The special nature of the Gauss-Radau quadrature points is captured in the
following property:

$$\forall\, \varphi \in P^\ell(I_j),\ \ell \le k,\ \ \forall\, p \in P^{2k-\ell}(I_j): \qquad \int_{I_j} \big( R_h(p)(x) - p(x) \big)\, \varphi(x)\, dx = 0. \qquad (2.36)$$
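Property (2.36) can be checked by hand in the simplest nontrivial case. The sketch below is an illustration under stated assumptions: it takes $k = 1$, $c > 0$, and the reference element $[-1, 1]$, for which the two Gauss-Radau points that include the right endpoint are $-1/3$ and $1$; all function names are ours. Polynomials are stored as coefficient lists (lowest degree first) and integrated exactly with rational arithmetic.

```python
from fractions import Fraction as F

# Gauss-Radau points on [-1, 1] for k = 1 with the right endpoint included.
NODES = (F(-1, 3), F(1))


def polyval(c, x):
    """Evaluate the polynomial sum_i c[i] * x**i."""
    return sum(ci * x ** i for i, ci in enumerate(c))


def polymul(a, b):
    """Coefficients of the product of two polynomials."""
    out = [F(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def integrate(c):
    """Exact integral of sum_i c[i] * x**i over [-1, 1]."""
    return sum(ci * (F(1) - F(-1) ** (i + 1)) / (i + 1) for i, ci in enumerate(c))


def radau_interp(p):
    """Linear interpolant of p at the two Gauss-Radau points."""
    x0, x1 = NODES
    y0, y1 = polyval(p, x0), polyval(p, x1)
    slope = (y1 - y0) / (x1 - x0)
    return [y0 - slope * x0, slope]


def defect(p, phi):
    """Integral of (R(p) - p) * phi over [-1, 1]."""
    r = radau_interp(p)
    n = max(len(r), len(p))
    diff = [(r[i] if i < len(r) else F(0)) - (p[i] if i < len(p) else F(0))
            for i in range(n)]
    return integrate(polymul(diff, phi))
```

For instance, `defect([0, 0, 1], [1])` (that is, $p = x^2$, $\varphi = 1$, the case $\ell = 0$) is zero, whereas `defect([0, 0, 0, 1], [1])` with $p = x^3 \notin P^{2k}$ equals 4/9, so the degree restriction in (2.36) cannot be relaxed.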


Compare this equality with (2.32).

The quantity $B_h(R_h(e), R_h(e))$. To prove our error estimate, we start by
considering the quantity $B_h(R_h(e), R_h(e))$. By Lemma 17, we have

$$B_h(R_h(e), R_h(e)) = \tfrac12 \|\, R_h(e(T)) \,\|^2_{L^2(0,1)} + \Theta_T(R_h(e)) - \tfrac12 \|\, R_h(e(0)) \,\|^2_{L^2(0,1)},$$

and since

$$B_h(R_h(e), R_h(e)) = B_h(R_h(e) - e, R_h(e)) = B_h(R_h(u) - u, R_h(e)),$$

by the error equation (2.31), we get

$$\tfrac12 \|\, R_h(e(T)) \,\|^2_{L^2(0,1)} + \Theta_T(R_h(e)) = \tfrac12 \|\, R_h(e(0)) \,\|^2_{L^2(0,1)} + B_h(R_h(u) - u, R_h(e)).$$

Next, we estimate the term $B_h(R_h(u) - u, R_h(e))$.

Estimating $B_h(R_h(u) - u, R_h(e))$. The following result corresponds to Lemma 18.


Lemma 21. We have

$$B_h(R_h(u) - u, v_h) = \int_0^T \!\! \int_0^1 \big( R_h(\partial_t u)(x,t) - \partial_t u(x,t) \big)\, v_h(x,t)\, dx\, dt - \int_0^T \sum_{1 \le j \le N} \int_{I_j} c\, \big( R_h(u)(x,t) - u(x,t) \big)\, \partial_x v_h(x,t)\, dx\, dt.$$

Proof. Setting $p = R_h(u) - u$ and $v_h = R_h(e)$ and recalling the definition of
$B_h(\cdot,\cdot)$, (2.29), we have

$$B_h(p, v_h) = \int_0^T \!\! \int_0^1 \partial_t p(x,t)\, v_h(x,t)\, dx\, dt - \int_0^T \sum_{1 \le j \le N} \int_{I_j} c\, p(x,t)\, \partial_x v_h(x,t)\, dx\, dt - \int_0^T \sum_{1 \le j \le N} h(p)_{j+1/2}(t)\, [\,v_h(t)\,]_{j+1/2}\, dt.$$

But, from the definition (2.11) of the flux $h$, we have

$$h(R_h(u) - u) = \frac{c}{2} \big( R_h(u)^+ + R_h(u)^- \big) - \frac{|c|}{2} \big( R_h(u)^+ - R_h(u)^- \big) - c\, u$$

$$= \frac{c - |c|}{2} \big( R_h(u)^+ - u \big) + \frac{c + |c|}{2} \big( R_h(u)^- - u \big)$$

$$= 0,$$

by (2.35), and the result follows.

Next, we need some approximation results.

Lemma 22. If $w \in H^{k+2}(I_j)$ and $v_h \in P^k(I_j)$, then

$$\Big| \int_{I_j} \big( R_h(w) - w \big)(x)\, v_h(x)\, dx \Big| \le c_k\, (\Delta x)^{k+1}\, |\,w\,|_{H^{k+1}(I_j)}\, \|\, v_h \,\|_{L^2(I_j)},$$

and

$$\Big| \int_{I_j} \big( R_h(w) - w \big)(x)\, \partial_x v_h(x)\, dx \Big| \le c_k\, (\Delta x)^{k+1}\, |\,w\,|_{H^{k+2}(I_j)}\, \|\, v_h \,\|_{L^2(I_j)},$$

where the constant $c_k$ depends solely on $k$.

Proof. The first inequality follows from the property (2.36) with $\ell = k$ and
from standard approximation results. The second follows in a similar way from the
property (2.36) with $\ell = k - 1$ and a standard scaling argument. This completes
the proof.

An immediate consequence of this result is the estimate we wanted.


Lemma 23. We have

$$B_h(R_h(u) - u, R_h(e)) \le c_k\, (\Delta x)^{k+1}\, |\,u_0\,|_{H^{k+2}(0,1)} \int_0^T \|\, R_h(e(t)) \,\|_{L^2(0,1)}\, dt,$$

where the constant $c_k$ depends solely on $k$ and $|c|$.


Conclusion. Finally, inserting in the equation (2.33) the estimate of its
right-hand side obtained in Lemma 23, we get

$$\|\, R_h(e(T)) \,\|^2_{L^2(0,1)} + \Theta_T(R_h(e)) \le \|\, R_h(e(0)) \,\|^2_{L^2(0,1)} + c_k\, (\Delta x)^{k+1}\, |\,u_0\,|_{H^{k+2}(0,1)} \int_0^T \|\, R_h(e(t)) \,\|_{L^2(0,1)}\, dt.$$

After applying a simple variation of the Gronwall lemma, we obtain

$$\|\, R_h(e(T)) \,\|_{L^2(0,1)} \le \|\, R_h(e(0)) \,\|_{L^2(0,1)} + c_k\, (\Delta x)^{k+1}\, T\, |\,u_0\,|_{H^{k+2}(0,1)}$$

$$\le c'_k\, (\Delta x)^{k+1}\, |\,u_0\,|_{H^{k+2}(0,1)}.$$
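The variation of the Gronwall lemma is not spelled out in the text. One way to obtain a bound of this type is sketched here for generic constants $a, b \ge 0$ and a continuous function $y \ge 0$ (this notation is ours):

$$y(t)^2 \le a^2 + 2b \int_0^t y(s)\, ds \quad \forall\, t \in [0,T] \qquad \Longrightarrow \qquad y(T) \le a + b\,T.$$

Indeed, for $\varepsilon > 0$ set $\Phi(t) = a^2 + \varepsilon + 2b \int_0^t y(s)\, ds$. Then $y \le \sqrt{\Phi}$ and $\Phi' = 2b\,y \le 2b \sqrt{\Phi}$, so that

$$\frac{d}{dt} \sqrt{\Phi(t)} = \frac{\Phi'(t)}{2 \sqrt{\Phi(t)}} \le b, \qquad \text{hence} \qquad y(T) \le \sqrt{\Phi(T)} \le \sqrt{a^2 + \varepsilon} + b\,T,$$

and the claim follows by letting $\varepsilon \to 0$. In the present setting one would take $y(t) = \|\, R_h(e(t)) \,\|_{L^2(0,1)}$, $a = \|\, R_h(e(0)) \,\|_{L^2(0,1)}$, and $2b = c_k\, (\Delta x)^{k+1}\, |\,u_0\,|_{H^{k+2}(0,1)}$, after dropping the nonnegative term $\Theta_T(R_h(e))$.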

Theorem 3 now follows from the above estimate and from the following inequality:

$$\|\, e(T) \,\|_{L^2(0,1)} \le \|\, u(T) - R_h(u(T)) \,\|_{L^2(0,1)} + \|\, R_h(e(T)) \,\|_{L^2(0,1)}$$

$$\le c_k\, (\Delta x)^{k+1}\, |\,u_0\,|_{H^{k+1}(0,1)} + \|\, R_h(e(T)) \,\|_{L^2(0,1)}.$$

Fig. 2.3. Comparison of the exact and the approximate solutions obtained with
$M = 20$, $\Delta x = 1/40$ at $T = 1/\pi$ (top) and at $T = 0.40$ (bottom): Exact solution
(solid line), piecewise linear solution (dotted line), and piecewise quadratic solution
(dashed line).

Fig. 2.4. Detail of previous figures. Behavior of the approximate solutions four
elements around the shock at $T = 1/\pi$ (top) and at $T = 0.40$ (bottom): Exact
solution (solid line), piecewise linear solution (dotted line), and piecewise quadratic
solution (dashed line).

Fig. 2.5. Detail of previous figures. Behavior of the approximate solutions two
elements in front of the shock at $T = 1/\pi$ (top) and at $T = 0.40$ (bottom): Exact
solution (solid line), piecewise linear solution (dotted line), and piecewise quadratic
solution (dashed line).

Fig. 2.6. Comparison of the exact and the approximate solutions obtained with
$M = 20$, $\Delta x = 1/40$ as the shock passes through one element. Exact solution
(solid line), piecewise linear elements (dotted line) and piecewise quadratic elements
(dashed line). Top: $T = 0.40$, middle: $T = 0.45$, and bottom: $T = 0.50$.

3 The Hamilton-Jacobi equations in one space dimension

3.1 Introduction

In this chapter, we extend the RKDG method to the following simple problem for
the Hamilton-Jacobi equation

$$\varphi_t + H(\varphi_x) = 0, \quad \text{in } (0,1) \times (0,T), \qquad (3.1)$$

$$\varphi(x,0) = \varphi_0(x), \quad \forall\, x \in (0,1), \qquad (3.2)$$

where we take periodic boundary conditions. The material in this section is based
on the work of Hu and Shu [43].


3.2 The RKDG method

The main idea to extend the RKDG method to this case is to realize that $u = \varphi_x$
satisfies the following problem:

$$u_t + H(u)_x = 0, \quad \text{in } (0,1) \times (0,T), \qquad (3.3)$$

$$u(x,0) = (\varphi_0)_x(x), \quad \forall\, x \in (0,1), \qquad (3.4)$$

and that $\varphi$ can be computed from $u$ by solving the following problem:

$$\varphi_t = -H(u), \quad \text{in } (0,1) \times (0,T), \qquad (3.5)$$

$$\varphi(x,0) = \varphi_0(x), \quad \forall\, x \in (0,1). \qquad (3.6)$$

A straightforward application of the RKDG method to the equations (3.3), (3.4)
produces a piecewise polynomial approximation $u_h$ to $u = \varphi_x$. If the approximating
polynomials are taken to be of degree $(k - 1)$, it is reasonable to seek an
approximation $\varphi_h$ to $\varphi$ that is piecewise a polynomial of degree $k$. To obtain it, we can
discretize (3.5), (3.6) in one of the following ways:

(i) Take $\varphi_h(\cdot, t)$ in $V_h^k$ such that

$$\forall\, j = 1, \dots, N,\ \ v_h \in P^k(I_j):$$

$$\int_{I_j} \partial_t \varphi_h(x,t)\, v_h(x)\, dx = - \int_{I_j} H(u_h(x,t))\, v_h(x)\, dx,$$

$$\int_{I_j} \varphi_h(x,0)\, v_h(x)\, dx = \int_{I_j} \varphi_0(x)\, v_h(x)\, dx.$$


(ii) Take $\varphi_h(\cdot, t)$ in $V_h^k$ such that

$$\forall\, j = 1, \dots, N: \qquad \partial_x \varphi_h(x,t) = u_h(x,t) \quad \forall\, x \in I_j.$$

This determines $\varphi_h$ up to a constant on each element. To find this constant, we impose the following
condition:

$$\forall\, j = 1, \dots, N:$$

$$\frac{d}{dt} \int_{I_j} \varphi_h(x,t)\, dx = - \int_{I_j} H(u_h(x,t))\, dx,$$

$$\int_{I_j} \varphi_h(x,0)\, dx = \int_{I_j} \varphi_0(x)\, dx.$$

(iii) Pick one element, say $I_J$, and determine the values of $\varphi_h$ on it by requiring
the following conditions:

$$\partial_x \varphi_h(x,t) = u_h(x,t) \quad \forall\, x \in I_J,$$

$$\frac{d}{dt} \int_{I_J} \varphi_h(x,t)\, dx = - \int_{I_J} H(u_h(x,t))\, dx,$$

$$\int_{I_J} \varphi_h(x,0)\, dx = \int_{I_J} \varphi_0(x)\, dx.$$

Then, compute $\varphi_h$ as follows:

$$\varphi_h(x,t) = \varphi_h(x_J, t) + \int_{x_J}^{x} u_h(s,t)\, ds.$$

Note that, unlike the previous approaches, the approximate solution $\varphi_h$ is now
continuous.
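The spatial integration step of approach (iii) can be sketched as follows. This is only an illustration under stated assumptions: the function name `recover_phi_at_edges` is ours, $x_J$ is taken to be the left edge of the chosen element, and the value $\varphi_h(x_J, t)$ is supposed to be already known from the single ODE solved on $I_J$. The sketch uses the fact that, for any polynomial degree, the integral of $u_h$ over an element equals its cell mean times the element length.

```python
from typing import List, Sequence


def recover_phi_at_edges(dx: Sequence[float], u_mean: Sequence[float],
                         J: int, phi_J: float) -> List[float]:
    """Values of phi_h at the element edges, starting from phi_h(x_J) = phi_J.

    dx[j] and u_mean[j] are the length and the mean of u_h on element j
    (0-based); edge j is the left edge of element j.
    """
    n = len(dx)
    phi = [0.0] * (n + 1)
    phi[J] = phi_J
    for j in range(J, n):                 # integrate to the right of x_J
        phi[j + 1] = phi[j] + u_mean[j] * dx[j]
    for j in range(J - 1, -1, -1):        # integrate to the left of x_J
        phi[j] = phi[j + 1] - u_mean[j] * dx[j]
    return phi


# u = 2x on four uniform elements of (0, 1), anchored at phi(0.5) = 0.25.
phi_edges = recover_phi_at_edges([0.25] * 4, [0.25, 0.75, 1.25, 1.75], 2, 0.25)
```

In this example `phi_edges` reproduces $\varphi = x^2$ at the edges $0, 1/4, 1/2, 3/4, 1$, and the recovered function is continuous by construction.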

The  advantage  of  the  first  two  approaches  is  that  they  can  be  carried  out  in 
parallel.  On  the  other  hand,  in  the  third  approach,  only  a  single  ODE  has  to  be 
solved;  moreover,  the  integration  in  space  takes  place  just  at  the  very  end  of  the 
whole  computation.  This  approach  is  much  more  efficient. 

It could be argued that in the third approach, the recovered values of $\varphi$ depend
upon the choice of the starting point $x_J$. However, this difference is on the level of
truncation errors and does not affect the order of accuracy. Hu and Shu [43] used
both the second and third approaches; they report that, when there are singularities
in the derivatives, the second approach will
often produce dents and bumps when the integral path in time passes through
the singularities. This can be avoided in the third approach. Indeed, the main
idea of using the third approach is to choose the element $I_J$ so that the time
integral paths do not cross derivative singularities. This cannot always be done
with a single element $I_J$, but it is always possible to switch to another element
before the singularity in the derivative hits the current element $I_J$. If the number of
discontinuities in the derivative is finite, this needs to be done only a finite number
of times. This maintains the efficiency of the method.

Note that all the properties of the RKDG method obtained in the previous
section apply to the approximate solution $u_h$. In particular, a consequence of the
work of Jiang and Shu [51] is the following result for the approximation $u_h$ to the
derivative $\varphi_x$; see also Proposition 6 and Theorem 7.


Theorem 24. For any of the above methods and any polynomial degree $k \ge 0$, we
have

\[
\frac{d}{dt} \int_a^b u_h^2(x,t)\,dx \le 0. \tag{3.7}
\]

Moreover, if the Hamiltonian $H$ is a strictly convex or concave function, for any
$k \ge 0$, if the numerical solution given by the DG method converges, it converges to
the viscosity solution.

Note that the above result trivially implies the TVB (total variation bounded)
property for the numerical solution $\varphi_h$. Indeed,

\[
TV(\varphi_h(t)) = \int_a^b |u_h(x,t)|\,dx \le \sqrt{b-a}\;\|\varphi_0'\|_{L^2(a,b)}.
\]

This is a rather strong stability result, considering that it holds independently of the
degree of the polynomial approximation even when the derivative of the solution
$\varphi_x$ develops discontinuities and without the application of the generalized slope
limiter!
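The intermediate steps behind the total variation bound can be written out as follows. This is a sketch under two assumptions that are not spelled out in the text above: that $\partial_x \varphi_h = u_h$ with $\varphi_h$ continuous (as in the third approach), and that the initial datum $u_h(\cdot,0)$ is the $L^2$-projection of $\varphi_0'$, so that its $L^2$-norm does not exceed that of $\varphi_0'$.

\[
\begin{aligned}
\int_a^b |u_h(x,t)|\,dx
&\le \Big(\int_a^b 1^2\,dx\Big)^{1/2} \Big(\int_a^b u_h^2(x,t)\,dx\Big)^{1/2}
&& \text{(Cauchy-Schwarz)}\\
&\le \sqrt{b-a}\;\|u_h(\cdot,0)\|_{L^2(a,b)}
&& \text{(by (3.7))}\\
&\le \sqrt{b-a}\;\|\varphi_0'\|_{L^2(a,b)}
&& \text{($L^2$-projection is a contraction).}
\end{aligned}
\]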


3.3  Computational  results 

In this section, we present the numerical experiments of Hu and Shu [43] showing
the performance of the RKDG method. Our main purpose is to assess the accuracy
of the method and to see whether the generalized slope limiter needs to be used. We display
the results obtained with the third approach to compute $\varphi_h$.

The first test problem. One dimensional Burgers' equation:

\[
\begin{aligned}
\varphi_t + \frac{(\varphi_x + 1)^2}{2} &= 0, && \text{in } (-1,1) \times (0,T),\\
\varphi(x,0) &= -\cos(\pi x), && \forall\, x \in (-1,1),
\end{aligned}
\]
with periodic boundary conditions.

The local Lax-Friedrichs flux is used. At $T = 0.5/\pi^2$, the solution is still smooth.
We list the errors and the numerical orders of accuracy in Table 3.1. We observe
that, except for the $P^1$ case, which seems to be only first order, $P^k$ for $k > 1$ seems
to provide close to $(k+1)$-th order accuracy. The meshes used are all uniform, and
errors are computed at the middle point of each interval.
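The "order" columns of the accuracy tables follow from two errors on successive meshes. A minimal sketch, assuming the mesh is refined by a factor of two between rows (the function name `observed_order` is illustrative only):

```python
import math


def observed_order(err_coarse, err_fine, refinement=2.0):
    """Numerical order of accuracy from errors on two successive meshes."""
    return math.log(err_coarse / err_fine) / math.log(refinement)


# Two successive P^3 L^1 errors taken from Table 3.1.
order = observed_order(0.75e-06, 0.43e-07)
print(round(order, 2))
```

Since the tabulated errors are rounded to two significant digits, an order recomputed this way can differ from the printed one in the last digit.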

To investigate the accuracy problem further, we use non-uniform meshes obtained
by randomly shifting the cell boundaries in a uniform mesh in the range
$[-0.1h, 0.1h]$. In order to avoid possible superconvergence at cell centers, we also
give the “real” $L^2$ error (computed by a 6-point Gaussian quadrature in each cell).
The results are shown in Table 3.2.
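One way to build such a mesh is sketched below. It is an illustration under assumptions not stated in the text: the two end points are kept fixed, the shifts are drawn from a uniform distribution, and the name `perturbed_mesh` and the `seed` argument are hypothetical.

```python
import random


def perturbed_mesh(a, b, n, fraction=0.1, seed=0):
    """Uniform mesh on [a, b] with interior boundaries shifted at random.

    Each interior boundary moves by at most fraction * h, where h is the
    uniform cell width, so neighbouring boundaries can never cross.
    """
    h = (b - a) / n
    rng = random.Random(seed)
    edges = [a + j * h for j in range(n + 1)]
    for j in range(1, n):  # end points stay where they are
        edges[j] += rng.uniform(-fraction * h, fraction * h)
    return edges


edges = perturbed_mesh(-1.0, 1.0, 10)
```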

At $T = 3.5/\pi^2$, the solution has developed a discontinuous derivative. In Fig.
3.1, we show the sharp corner-like numerical solution with 41 elements obtained
with $P^k$ for $k = 1,2,3,4$ with a uniform mesh. Here and below, the solid line is
the exact solution and the circles are the numerical solution (only one point per element
is drawn).

The second test problem. One dimensional equation with a non-convex flux:

\[
\begin{aligned}
\varphi_t - \cos(\varphi_x + 1) &= 0, && \text{in } (-1,1) \times (0,T),\\
\varphi(x,0) &= -\cos(\pi x), && \forall\, x \in (-1,1),
\end{aligned}
\]

with periodic boundary conditions.

The local Lax-Friedrichs flux and uniform meshes are used. At $T = 0.5/\pi^2$, the
solution is still smooth. The accuracy of the numerical solution is listed in Table
3.3. We observe similar accuracy as in the previous example.

Table 3.1. Accuracy for 1D Burgers equation (uniform mesh), $T = 0.5/\pi^2$.

     P^1                     P^2                     P^3                     P^4
  N  L^1 error   order       L^1 error   order       L^1 error   order       L^1 error   order
 10  0.17E+00    -           0.14E-02    -           0.21E-03    -           0.57E-05    -
 20  [illegible] 1.12        0.18E-03    2.92        0.13E-04    3.94        0.73E-06    2.97
 40  0.35E-01    1.16        [illegible] 2.97        0.75E-06    4.17        0.32E-07    4.52
 80  [illegible] 1.12        0.28E-05    [illegible] 0.43E-07    4.12        0.12E-08    4.79
160  [illegible] 1.02        [illegible] 3.19        0.25E-08    4.10        0.48E-10    4.59

     P^1                     P^2                     P^3                     P^4
  N  L^∞ error   order       L^∞ error   order       L^∞ error   order       L^∞ error   order
 10  0.29E+00    -           0.24E-02    -           0.69E-03    -           0.13E-04    -
 20  0.13E+00    1.13        0.33E-03    [illegible] 0.61E-04    3.51        0.16E-05    2.99
 40  [illegible] 1.15        0.37E-04    [illegible] [illegible] [illegible] [illegible] [illegible]
 80  0.27E-01    1.11        [illegible] 2.97        [illegible] 3.93        0.59E-08    4.44
160  0.13E-01    [illegible] [illegible] [illegible] 0.23E-07    4.07        0.25E-09    4.57

Table 3.2. Accuracy for 1D Burgers equation (non-uniform mesh), $T = 0.5/\pi^2$.

     P^1                     P^2                     P^3                     P^4
  N  L^2 error   order       L^2 error   order       L^2 error   order       L^2 error   order
 10  0.74E+00    -           [illegible] -           0.32E-03    -           0.53E-04    -
 20  0.34E+00    1.11        0.51E-03    2.76        0.24E-04    [illegible] [illegible] 4.71
 40  0.15E+00    [illegible] [illegible] [illegible] 0.17E-05    [illegible] [illegible] 4.84
 80  0.67E-01    1.17        0.90E-05    2.86        0.13E-06    3.72        0.20E-08    5.15
160  0.31E-01    1.13        [illegible] [illegible] [illegible] [illegible] 0.74E-10    4.76

     P^1                     P^2                     P^3                     P^4
  N  L^1 error   order       L^1 error   order       L^1 error   order       L^1 error   order
 10  [illegible] -           [illegible] -           0.23E-03    -           0.30E-05    -
 20  0.24E+00    [illegible] [illegible] [illegible] 0.14E-04    [illegible] [illegible] 2.89
 40  0.11E+00    1.19        0.26E-04    [illegible] [illegible] [illegible] 0.16E-07    4.65
 80  0.47E-01    1.17        0.37E-05    2.82        [illegible] [illegible] [illegible] [illegible]
160  0.21E-01    1.13        0.41E-06    3.16        0.27E-08    [illegible] [illegible] 4.56

     P^1                     P^2                     P^3                     P^4
  N  L^∞ error   order       L^∞ error   order       L^∞ error   order       L^∞ error   order
 10  0.62E+00    -           0.36E-02    -           [illegible] -           0.11E-04    -
 20  [illegible] 1.11        0.47E-03    2.94        0.61E-04    3.52        0.16E-05    2.81
 40  [illegible] 1.16        0.67E-04    [illegible] [illegible] [illegible] [illegible] 3.64
 80  0.58E-01    [illegible] [illegible] [illegible] 0.62E-06    2.91        0.59E-08    4.45
160  0.27E-01    [illegible] 0.19E-05    3.11        0.31E-07    [illegible] [illegible] 4.17

Fig. 3.1. One-dimensional Burgers' equation, $T = 3.5/\pi^2$.


At $T = 1.5/\pi^2$, the solution has developed a corner-like discontinuity in the
derivative. The numerical result with 41 elements is shown in Fig. 3.2.

The third test problem. Riemann problem for the one dimensional equation
with a non-convex flux:

\[
\begin{aligned}
\varphi_t + \tfrac{1}{4}\,(\varphi_x^2 - 1)(\varphi_x^2 - 4) &= 0, && \text{in } (-1,1) \times (0,T),\\
\varphi(x,0) &= -2|x|, && \forall\, x \in (-1,1).
\end{aligned}
\]

For  this  test  problem,  the  use  of  the  generalized  slope  limiter  proved  to  be 
essential  since  otherwise  the  approximate  solution  does  not  converge  to  the  viscosity 
solution;  this  is  the  only  example  in  which  we  use  the  nonlinear  limiting.  We  remark 
that  for  the  finite  difference  schemes,  such  nonlinear  limiting  or  the  adaptive  stencil 
in  ENO  is  needed  in  most  cases  in  order  to  enforce  stability  and  to  obtain  non- 
oscillatory  results. 

Numerical results at $T = 1$ with 81 elements, using the local Lax-Friedrichs
flux, are shown in Fig. 3.3. The results of using the Godunov flux are shown in Fig.
3.4. We can see that while for $P^1$ the results of using two different monotone fluxes
are significantly different in resolution, this difference is greatly reduced for higher
order of accuracy. In most of the high order cases, the simple local Lax-Friedrichs
flux is a very good choice.

Table 3.3. Accuracy for 1D non-convex, $H(u) = -\cos(u + 1)$, $T = 0.5/\pi^2$.

     P^1                     P^2                     P^3                     P^4
  N  L^1 error   order       L^1 error   order       L^1 error   order       L^1 error   order
 10  [illegible] -           0.10E-02    -           0.34E-03    -           0.24E-04    -
 20  0.36E-01    1.23        0.15E-03    [illegible] [illegible] 3.49        0.13E-05    4.28
 40  0.15E-01    [illegible] [illegible] [illegible] 0.15E-05    4.33        0.59E-07    4.42
 80  0.68E-02    [illegible] [illegible] 2.97        0.94E-07    [illegible] 0.21E-08    4.78

     P^1                     P^2                     P^3                     P^4
  N  L^∞ error   order       L^∞ error   order       L^∞ error   order       L^∞ error   order
 10  0.18E+00    -           0.15E-02    -           0.11E-02    -           0.99E-04    -
 20  0.73E-01    1.31        0.27E-03    2.43        0.22E-03    2.35        0.13E-04    2.95
 40  0.31E-01    1.24        0.47E-04    [illegible] 0.18E-04    3.63        0.59E-06    4.44
 80  0.14E-01    1.16        0.85E-05    [illegible] 0.14E-05    3.75        0.26E-07    4.49


3.4  Concluding  Remarks. 

In this section, we have extended the RKDG method, originally devised for nonlinear
conservation laws, to the Hamilton-Jacobi equations. The extension was carried
out by exploiting the fact that the derivative of the solution of the Hamilton-Jacobi
equation satisfies a nonlinear conservation law.

The numerical experiments show that when polynomials of degree $k$ are used,
the method is of order $(k+1)$ in $L^2$, except when $k = 1$; this phenomenon remains
to be explained. Also, we have seen that the use of slope limiters was needed only
in the third test problem; there, convergence to the viscosity solution did
not take place without them.

The scheme can be extended to the case of a bounded domain in a very simple
way. The extension of the scheme to the multidimensional case is not quite
straightforward and will be carried out after we study how to define the RKDG method
for multidimensional conservation laws.


Fig. 3.3. One dimensional Riemann problem, local Lax-Friedrichs flux, $H(u) = \tfrac{1}{4}(u^2 - 1)(u^2 - 4)$, $T = 1$.


Fig. 3.4. One dimensional Riemann problem, Godunov flux, $H(u) = \tfrac{1}{4}(u^2 - 1)(u^2 - 4)$, $T = 1$.


4 The RKDG method for multi-dimensional systems

4.1  Introduction 

In this section, we extend the RKDG methods to multidimensional systems:

\[
\begin{aligned}
u_t + \nabla \cdot f(u) &= 0, && \text{in } \Omega \times (0,T), && (4.1)\\
u(x,0) &= u_0(x), && \forall\, x \in \Omega, && (4.2)
\end{aligned}
\]

and periodic boundary conditions. For simplicity, we assume that $\Omega$ is the unit
cube.

This  section  is  essentially  devoted  to  the  description  of  the  algorithms  and 
their  implementation  details.  The  practitioner  should  be  able  to  find  here  all  the 
necessary  information  to  completely  code  the  RKDG  methods. 

This section also contains two sets of numerical results for the Euler equations
of gas dynamics in two space dimensions. The first set is devoted to transient
computations and domains that have corners; the effect of using triangles or rectangles
and the effect of using polynomials of degree one or two are explored. The main
conclusions from these computations are that (i) the RKDG method works as well
with triangles as it does with rectangles and that (ii) the use of high-order
polynomials does not deteriorate the approximation of strong shocks and is advantageous
in the approximation of contact discontinuities.

The second set concerns steady state computations with smooth solutions. For
these computations, no generalized slope limiter is needed. The effect of (i) the
quality of the approximation of curved boundaries and of (ii) the degree of the
polynomials on the quality of the approximate solution is explored. The main
conclusions from these computations are that (i) a high-order approximation of the
curved boundaries introduces a dramatic improvement in the quality of the
solution and that (ii) the use of high-degree polynomials is advantageous when smooth
solutions are sought.

This section contains material from the papers [21], [20], and [28]. It also
contains numerical results from the paper by Bassi and Rebay [4] in two dimensions
and from the paper by Warburton, Lomtev, Kirby and Karniadakis [91] in three
dimensions.

4.2  The  general  RKDG  method 

The  RKDG  method  for  multidimensional  systems  has  the  same  structure  it  has  for 
one-dimensional  scalar  conservation  laws,  that  is, 

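- Set $u_h^0 = \Lambda\Pi_h\, P_{V_h}(u_0)$;

- For $n = 0,\dots,N-1$ compute $u_h^{n+1}$ as follows:

  1. set $u_h^{(0)} = u_h^n$;

  2. for $i = 1,\dots,k+1$ compute the intermediate functions:

     \[
     u_h^{(i)} = \Lambda\Pi_h \Big\{ \sum_{l=0}^{i-1} \alpha_{il}\, u_h^{(l)} + \beta_{il}\, \Delta t^n\, L_h(u_h^{(l)}) \Big\};
     \]

  3. set $u_h^{n+1} = u_h^{(k+1)}$.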
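The stage loop above can be sketched in a few lines. The coefficient tables below are those of the standard third-order TVD Runge-Kutta method in Shu-Osher form; they are an assumption here, since the values of $\alpha_{il}$ and $\beta_{il}$ are not listed in this section. The names `rk_step`, `L` and `limiter` are illustrative, and the solution is represented as a plain list of degrees of freedom.

```python
import math

# Shu-Osher coefficients of the third-order TVD Runge-Kutta method (assumed).
ALPHA = [[1.0], [0.75, 0.25], [1.0 / 3.0, 0.0, 2.0 / 3.0]]
BETA = [[1.0], [0.0, 0.25], [0.0, 0.0, 2.0 / 3.0]]


def rk_step(u, dt, L, limiter=lambda w: w):
    """Advance the degrees of freedom u by one time step.

    L       : callable returning the list L_h(w) for a list w
    limiter : callable playing the role of the generalized slope limiter
    """
    stages = [list(u)]  # u^(0) = u^n
    rates = []          # L_h(u^(l)), evaluated once per stage
    for alpha_i, beta_i in zip(ALPHA, BETA):
        rates.append(L(stages[-1]))
        new = [
            sum(a * w[m] + b * dt * r[m]
                for a, b, w, r in zip(alpha_i, beta_i, stages, rates))
            for m in range(len(u))
        ]
        stages.append(limiter(new))  # limit every intermediate function
    return stages[-1]                # u^(n+1) is the last stage


# Usage on the scalar test equation du/dt = -u with u(0) = 1.
u = [1.0]
for _ in range(10):
    u = rk_step(u, 0.1, lambda w: [-x for x in w])
print(abs(u[0] - math.exp(-1.0)) < 1e-4)
```

Each stage is a convex combination of forward Euler steps, which is why applying the limiter after every stage, rather than once per time step, fits naturally into this form.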

In what follows, we describe the operator $L_h$ that results from the DG-space
discretization, and the generalized slope limiter $\Lambda\Pi_h$.


The Discontinuous Galerkin space discretization. To show how to discretize
in space by the DG method, it is enough to consider the case in which $u$ is
a scalar quantity since, to deal with the general case in which $u$ is a vector, we apply the same
procedure component by component.

Once a triangulation $\mathcal{T}_h$ of $\Omega$ has been obtained, we determine $L_h(\cdot)$ as follows.
First, we multiply (4.1) by $v_h$ in the finite element space $V_h$, integrate over the
element $K$ of the triangulation $\mathcal{T}_h$ and replace the exact solution $u$ by its
approximation $u_h \in V_h$:

\[
\frac{d}{dt} \int_K u_h(t,x)\, v_h(x)\,dx + \int_K \operatorname{div} f(u_h(t,x))\, v_h(x)\,dx = 0, \quad \forall\, v_h \in V_h. \tag{4.3}
\]

Integrating by parts formally we obtain

\[
\frac{d}{dt} \int_K u_h(t,x)\, v_h(x)\,dx + \sum_{e \in \partial K} \int_e f(u_h(t,x)) \cdot n_{e,K}\, v_h(x)\,d\Gamma
- \int_K f(u_h(t,x)) \cdot \nabla v_h(x)\,dx = 0, \quad \forall\, v_h \in V_h,
\]

where $n_{e,K}$ is the outward unit normal to the edge $e$. Notice that $f(u_h(t,x)) \cdot n_{e,K}$
does not have a precise meaning, for $u_h$ is discontinuous at $x \in e \in \partial K$.
Thus, as in the one dimensional case, we replace $f(u_h(t,x)) \cdot n_{e,K}$ by the function
$h_{e,K}(u_h(t,x^{int(K)}), u_h(t,x^{ext(K)}))$. The function $h_{e,K}(\cdot,\cdot)$ is any two-point
monotone Lipschitz flux, consistent with $f(u) \cdot n_{e,K}$.
In this way we obtain

\[
\frac{d}{dt} \int_K u_h(t,x)\, v_h(x)\,dx + \sum_{e \in \partial K} \int_e h_{e,K}(t,x)\, v_h(x)\,d\Gamma
- \int_K f(u_h(t,x)) \cdot \nabla v_h(x)\,dx = 0, \quad \forall\, v_h \in V_h.
\]

Finally, we replace the integrals by quadrature rules that we shall choose as follows:

\[
\int_e h_{e,K}(t,x)\, v_h(x)\,d\Gamma \approx \sum_{l=1}^{L} \omega_l\, h_{e,K}(t,x_{el})\, v_h(x_{el})\, |e|, \tag{4.4}
\]

\[
\int_K f(u_h(t,x)) \cdot \nabla v_h(x)\,dx \approx \sum_{j=1}^{M} \omega_j\, f(u_h(t,x_{Kj})) \cdot \nabla v_h(x_{Kj})\, |K|. \tag{4.5}
\]

Thus, we finally obtain, for each element $K \in \mathcal{T}_h$, the weak formulation:

\[
\frac{d}{dt} \int_K u_h(t,x)\, v_h(x)\,dx + \sum_{e \in \partial K} \sum_{l=1}^{L} \omega_l\, h_{e,K}(t,x_{el})\, v_h(x_{el})\, |e|
- \sum_{j=1}^{M} \omega_j\, f(u_h(t,x_{Kj})) \cdot \nabla v_h(x_{Kj})\, |K| = 0, \quad \forall\, v_h \in V_h.
\]


These equations can be rewritten in ODE form as $\frac{d}{dt} u_h = L_h(u_h, \gamma_h)$. This
defines the operator $L_h(u_h)$, which is a discrete approximation of $-\operatorname{div} f(u)$. The
following result gives an indication of the quality of this approximation.

Proposition 25. Let $f(u) \in W^{k+2,\infty}(\Omega)$, and set $\gamma = \operatorname{trace}(u)$. Let the quadrature
rule over the edges be exact for polynomials of degree $(2k+1)$, and let the one over
the element be exact for polynomials of degree $(2k)$. Assume that the family of
triangulations $\mathcal{F} = \{\mathcal{T}_h\}_{h>0}$ is regular, i.e., that there is a constant $\sigma$ such that:

\[
\frac{h_K}{\rho_K} \le \sigma, \quad \forall\, K \in \mathcal{T}_h, \quad \forall\, \mathcal{T}_h \in \mathcal{F}, \tag{4.6}
\]

where $h_K$ is the diameter of $K$, and $\rho_K$ is the diameter of the biggest ball included
in $K$. Then, if $V(K) \supset P^k(K)$, $\forall\, K \in \mathcal{T}_h$:

\[
\| L_h(u,\gamma) + \operatorname{div} f(u) \|_{L^\infty(\Omega)} \le C\, h^{k+1}\, | f(u) |_{W^{k+2,\infty}(\Omega)}.
\]


For a proof, see [20].


The form of the generalized slope limiter $\Lambda\Pi_h$. The construction of
generalized slope limiters $\Lambda\Pi_h$ for several space dimensions is not a trivial matter
and will not be discussed in these notes; we refer the interested reader to the paper
by Cockburn, Hou, and Shu [20].

In these notes, we restrict ourselves to displaying very simple, practical, and
effective generalized slope limiters $\Lambda\Pi_h$ which are closely related to the generalized
slope limiters $\Lambda\Pi_h^k$ of the previous section.

To compute $\Lambda\Pi_h u_h$, we rely on the assumption that spurious oscillations are
present in $u_h$ only if they are present in its $P^1$ part $u_h^1$, which is its $L^2$-projection
into the space of piecewise linear functions $V_h^1$. Thus, if they are not present in $u_h^1$,
i.e., if

\[
u_h^1 = \Lambda\Pi_h\, u_h^1,
\]

then we assume that they are not present in $u_h$ and hence do not do any limiting:

\[
\Lambda\Pi_h\, u_h = u_h.
\]

On the other hand, if spurious oscillations are present in the $P^1$ part of the solution
$u_h^1$, i.e., if

\[
u_h^1 \ne \Lambda\Pi_h\, u_h^1,
\]

then we chop off the higher order part of the numerical solution, and limit the
remaining $P^1$ part:

\[
\Lambda\Pi_h\, u_h = \Lambda\Pi_h\, u_h^1.
\]

In this way, in order to define $\Lambda\Pi_h$ for an arbitrary space $V_h$, we only need to actually
define it for piecewise linear functions $V_h^1$. The exact way to do that, both for the
triangular elements and for the rectangular elements, will be discussed in the next
section.
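The two-branch logic just described can be sketched for a single one-dimensional cell. This is only an illustration under stated assumptions: the cell stores modal coefficients `[mean, slope, higher modes...]` in an orthogonal basis (so the $P^1$ part is simply the first two coefficients), the plain minmod function stands in for the actual $P^1$ limiter, and the names `minmod` and `limit_cell` are hypothetical.

```python
def minmod(*args):
    """Smallest argument in magnitude if all share a sign, else zero."""
    if all(a > 0 for a in args):
        return min(args)
    if all(a < 0 for a in args):
        return max(args)
    return 0.0


def limit_cell(coeffs, mean_left, mean_right):
    """Apply the 'limit only if the P^1 part needs it' rule to one cell."""
    mean, slope = coeffs[0], coeffs[1]
    limited = minmod(slope, mean_right - mean, mean - mean_left)
    if limited == slope:
        # The P^1 part is unchanged by the limiter: keep u_h as it is.
        return list(coeffs)
    # Otherwise chop off the higher order part and keep the limited P^1 part.
    return [mean, limited] + [0.0] * (len(coeffs) - 2)


smooth = limit_cell([1.0, 0.125, 0.02], 0.75, 1.25)
wiggly = limit_cell([1.0, 0.5, 0.02], 0.875, 1.125)
```

In the first call the high-order coefficient survives; in the second it is discarded and the slope is reduced. The cell mean is untouched in both cases.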


4.3  Algorithm  and  implementation  details 

In this section we give the algorithm and implementation details, including
numerical fluxes, quadrature rules, degrees of freedom, and limiters of the RKDG
method for both piecewise-linear and piecewise-quadratic approximations in both
triangular and rectangular elements.

Fluxes. The numerical flux we use is the simple Lax-Friedrichs flux:

\[
h_{e,K}(a,b) = \frac{1}{2} \big[ f(a) \cdot n_{e,K} + f(b) \cdot n_{e,K} - \alpha_{e,K}\,(b - a) \big].
\]

The numerical viscosity constant $\alpha_{e,K}$ should be an estimate of the biggest eigenvalue
of the Jacobian $\frac{\partial}{\partial u} f(u_h(x,t)) \cdot n_{e,K}$ for $(x,t)$ in a neighborhood of the edge
$e$.

For the triangular elements, we use the local Lax-Friedrichs recipe:

- Take $\alpha_{e,K}$ to be the larger one of the largest eigenvalue (in absolute value) of
  $\frac{\partial}{\partial u} f(\bar u_K) \cdot n_{e,K}$ and that of $\frac{\partial}{\partial u} f(\bar u_{K'}) \cdot n_{e,K}$, where $\bar u_K$ and $\bar u_{K'}$ are the means
  of the numerical solution in the elements $K$ and $K'$ sharing the edge $e$.

For the rectangular elements, we use the local Lax-Friedrichs recipe:

- Take $\alpha_{e,K}$ to be the largest of the largest eigenvalue (in absolute value) of
  $\frac{\partial}{\partial u} f(\bar u_{K''}) \cdot n_{e,K}$, where $\bar u_{K''}$ is the mean of the numerical solution in the element
  $K''$, which runs over all elements on the same line (horizontally or vertically,
  depending on the direction of $n_{e,K}$) with $K$ and $K'$ sharing the edge $e$.
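For a scalar equation the flux and the two-element recipe reduce to a few lines. The sketch below assumes a scalar flux whose normal component is `f(u) * normal` with `normal` equal to +1 or -1, and uses Burgers' flux purely as an example; the function names are illustrative.

```python
def lax_friedrichs_flux(f, a, b, normal, alpha):
    """Lax-Friedrichs flux h(a, b) for a scalar flux f along a unit normal.

    a is the trace from inside the element, b the trace from outside.
    """
    return 0.5 * (f(a) * normal + f(b) * normal - alpha * (b - a))


def local_alpha(dfdu, mean_K, mean_Kp, normal):
    """Viscosity constant from the means of the two elements sharing the edge."""
    return max(abs(dfdu(mean_K) * normal), abs(dfdu(mean_Kp) * normal))


def burgers(u):
    return 0.5 * u * u


def burgers_prime(u):
    return u


alpha = local_alpha(burgers_prime, 1.0, 0.0, 1.0)
flux_from_K = lax_friedrichs_flux(burgers, 1.0, 0.0, 1.0, alpha)
# Seen from the neighbour, the traces swap and the normal changes sign.
flux_from_Kp = lax_friedrichs_flux(burgers, 0.0, 1.0, -1.0, alpha)
```

The two values are equal and opposite, which is what makes the scheme conservative, and setting `a == b` returns `f(a) * normal`, which is the consistency requirement.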

Quadrature rules. According to the analysis done in [20], the quadrature rules
for the edges of the elements, (4.4), must be exact for polynomials of degree $2k+1$,
and the quadrature rules for the interior of the elements, (4.5), must be exact for
polynomials of degree $2k$, if $P^k$ methods are used. Here we discuss the quadrature
points used for $P^1$ and $P^2$ in the triangular and rectangular element cases.

The rectangular elements. For the edge integral, we use the following two
point Gaussian rule

\[
\int_{-1}^{1} g(x)\,dx \approx g\Big(-\frac{1}{\sqrt{3}}\Big) + g\Big(\frac{1}{\sqrt{3}}\Big), \tag{4.1}
\]

for the $P^1$ case, and the following three point Gaussian rule

\[
\int_{-1}^{1} g(x)\,dx \approx \frac{5}{9} \Big[ g\Big(-\sqrt{\tfrac{3}{5}}\Big) + g\Big(\sqrt{\tfrac{3}{5}}\Big) \Big] + \frac{8}{9}\, g(0), \tag{4.2}
\]

for the $P^2$ case, suitably scaled to the relevant intervals.
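The exactness requirement on the edge rules (degree $2k+1$) can be checked directly on monomials. The following sketch uses the standard two and three point Gauss-Legendre nodes and weights on $[-1,1]$; the helper names are illustrative.

```python
import math


def gauss2(g):
    """Two point Gauss-Legendre rule on [-1, 1]."""
    r = 1.0 / math.sqrt(3.0)
    return g(-r) + g(r)


def gauss3(g):
    """Three point Gauss-Legendre rule on [-1, 1]."""
    r = math.sqrt(3.0 / 5.0)
    return (5.0 / 9.0) * (g(-r) + g(r)) + (8.0 / 9.0) * g(0.0)


def exact_monomial(p):
    """Exact integral of x**p over [-1, 1]."""
    return 0.0 if p % 2 else 2.0 / (p + 1)


# Highest monomial degree each rule integrates exactly (up to rounding).
for name, rule in (("gauss2", gauss2), ("gauss3", gauss3)):
    p = 0
    while abs(rule(lambda x, p=p: x ** p) - exact_monomial(p)) < 1e-12:
        p += 1
    print(name, "is exact up to degree", p - 1)
```

The two point rule is exact up to degree 3 and the three point rule up to degree 5, matching $2k+1$ for $k = 1$ and $k = 2$.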

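For the interior of the elements, we could use a tensor product of (4.1), with
four quadrature points, for the $P^1$ case. But to save cost, we “recycle” the values
of the fluxes at the element boundaries, and only add one new quadrature point in
the middle of the element. Thus, to approximate the integral $\int_{-1}^{1}\int_{-1}^{1} g(x,y)\,dx\,dy$,
we use the following quadrature rule:

\[
\begin{aligned}
\int_{-1}^{1}\int_{-1}^{1} g(x,y)\,dx\,dy \approx \frac{1}{4} \Big[\,
& g\Big(-1, \tfrac{1}{\sqrt{3}}\Big) + g\Big(-1, -\tfrac{1}{\sqrt{3}}\Big)
 + g\Big(-\tfrac{1}{\sqrt{3}}, -1\Big) + g\Big(\tfrac{1}{\sqrt{3}}, -1\Big) \\
& + g\Big(1, -\tfrac{1}{\sqrt{3}}\Big) + g\Big(1, \tfrac{1}{\sqrt{3}}\Big)
 + g\Big(\tfrac{1}{\sqrt{3}}, 1\Big) + g\Big(-\tfrac{1}{\sqrt{3}}, 1\Big) \Big] \\
& + 2\, g(0,0).
\end{aligned}
\tag{4.3}
\]

For the $P^2$ case, we use a tensor product of (4.2), with 9 quadrature points.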


The triangular elements. For the edge integral, we use the same two point or
three point Gaussian quadratures as in the rectangular case, (4.1) and (4.2), for
the $P^1$ and $P^2$ cases, respectively.

For the interior integrals (4.5), we use the three mid-point rule

\[
\int_K g(x,y)\,dx\,dy \approx \frac{|K|}{3} \sum_{i=1}^{3} g(m_i),
\]

where $m_i$ are the mid-points of the edges, for the $P^1$ case. For the $P^2$ case, we
use a seven-point quadrature rule which is exact for polynomials of degree 5 over
triangles.
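Both $P^1$ interior rules can be checked against the required exactness for polynomials of degree $2k = 2$. The sketch below assumes that the rectangular rule uses all eight edge Gauss points with weight $1/4$ plus the centre with weight $2$, and that the mid-point rule carries the weight $|K|/3$; the function names are illustrative.

```python
import math


def rect_rule_p1(g):
    """Nine point rule on [-1, 1]^2: eight edge Gauss points and the centre."""
    r = 1.0 / math.sqrt(3.0)
    edge_points = [(-1.0, r), (-1.0, -r), (-r, -1.0), (r, -1.0),
                   (1.0, -r), (1.0, r), (r, 1.0), (-r, 1.0)]
    return 0.25 * sum(g(x, y) for x, y in edge_points) + 2.0 * g(0.0, 0.0)


def triangle_midpoint_rule(g, v0, v1, v2):
    """Three mid-point rule on the triangle with vertices v0, v1, v2."""
    area = 0.5 * abs((v1[0] - v0[0]) * (v2[1] - v0[1])
                     - (v2[0] - v0[0]) * (v1[1] - v0[1]))
    mids = [((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)
            for p, q in ((v0, v1), (v1, v2), (v2, v0))]
    return area / 3.0 * sum(g(x, y) for x, y in mids)


# Exact values: the integral of x^2 + xy + y over the square is 4/3,
# and the integral of xy over the unit right triangle is 1/24.
square_value = rect_rule_p1(lambda x, y: x * x + x * y + y)
triangle_value = triangle_midpoint_rule(lambda x, y: x * y,
                                        (0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
```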


Basis and degrees of freedom. We emphasize that the choice of basis and
degrees of freedom does not affect the algorithm, as it is completely determined by
the choice of function space $V_h$, the numerical fluxes, the quadrature rules, the
slope limiting, and the time discretization. However, a suitable choice of basis and
degrees of freedom may simplify the implementation and calculation.


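The rectangular elements. For the $P^1$ case, we use the following expression for
the approximate solution $u_h(x,y,t)$ inside the rectangular element
$[x_{i-1/2}, x_{i+1/2}] \times [y_{j-1/2}, y_{j+1/2}]$:

\[
u_h(x,y,t) = \bar u(t) + u_x(t)\,\phi_i(x) + u_y(t)\,\psi_j(y), \tag{4.4}
\]

where

\[
\phi_i(x) = \frac{x - x_i}{\Delta x_i/2}, \qquad \psi_j(y) = \frac{y - y_j}{\Delta y_j/2}, \tag{4.5}
\]

and

\[
\Delta x_i = x_{i+1/2} - x_{i-1/2}, \qquad \Delta y_j = y_{j+1/2} - y_{j-1/2}.
\]

The degrees of freedom, to be evolved in time, are then

\[
\bar u(t), \quad u_x(t), \quad u_y(t).
\]

Here we have omitted the subscripts $ij$ these degrees of freedom should have, to
indicate that they belong to the element $ij$, which is $[x_{i-1/2}, x_{i+1/2}] \times [y_{j-1/2}, y_{j+1/2}]$.
Notice that the basis functions

\[
1, \quad \phi_i(x), \quad \psi_j(y),
\]

are orthogonal, hence the local mass matrix is diagonal:

\[
M = \Delta x_i\, \Delta y_j\ \operatorname{diag}\Big(1, \frac{1}{3}, \frac{1}{3}\Big).
\]

For the $P^2$ case, the expression for the approximate solution $u_h(x,y,t)$ inside
the rectangular element $[x_{i-1/2}, x_{i+1/2}] \times [y_{j-1/2}, y_{j+1/2}]$ is:

\[
\begin{aligned}
u_h(x,y,t) = {} & \bar u(t) + u_x(t)\,\phi_i(x) + u_y(t)\,\psi_j(y) + u_{xy}(t)\,\phi_i(x)\,\psi_j(y) \\
& + u_{xx}(t)\Big(\phi_i^2(x) - \frac{1}{3}\Big) + u_{yy}(t)\Big(\psi_j^2(y) - \frac{1}{3}\Big),
\end{aligned}
\tag{4.6}
\]

where $\phi_i(x)$ and $\psi_j(y)$ are defined by (4.5). The degrees of freedom, to be evolved
in time, are

\[
\bar u(t), \quad u_x(t), \quad u_y(t), \quad u_{xy}(t), \quad u_{xx}(t), \quad u_{yy}(t).
\]

Again the basis functions

\[
1, \quad \phi_i(x), \quad \psi_j(y), \quad \phi_i(x)\,\psi_j(y), \quad \phi_i^2(x) - \frac{1}{3}, \quad \psi_j^2(y) - \frac{1}{3},
\]

are orthogonal, hence the local mass matrix is diagonal:

\[
M = \Delta x_i\, \Delta y_j\ \operatorname{diag}\Big(1, \frac{1}{3}, \frac{1}{3}, \frac{1}{9}, \frac{4}{45}, \frac{4}{45}\Big).
\]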
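The diagonal entries of the $P^2$ mass matrix can be confirmed numerically on the reference square $[-1,1]^2$, where $\phi_i$ and $\psi_j$ reduce to the coordinates themselves. The sketch below is an independent check, not part of the scheme: it integrates the basis products with a tensor product of the three point Gauss rule and divides by the area, so the result is $M/(\Delta x_i \Delta y_j)$. The names `BASIS` and `reference_mass_matrix` are illustrative.

```python
import math

# P^2 basis on the reference square, in the order used in (4.6).
BASIS = [
    lambda x, y: 1.0,
    lambda x, y: x,
    lambda x, y: y,
    lambda x, y: x * y,
    lambda x, y: x * x - 1.0 / 3.0,
    lambda x, y: y * y - 1.0 / 3.0,
]


def reference_mass_matrix():
    """Return M / (dx * dy) computed with a 3 x 3 Gauss rule."""
    r = math.sqrt(3.0 / 5.0)
    nodes = [(-r, 5.0 / 9.0), (0.0, 8.0 / 9.0), (r, 5.0 / 9.0)]
    n = len(BASIS)
    M = [[0.0] * n for _ in range(n)]
    for x, wx in nodes:
        for y, wy in nodes:
            for p in range(n):
                for q in range(n):
                    # The factor 1/4 divides by the area of [-1, 1]^2.
                    M[p][q] += 0.25 * wx * wy * BASIS[p](x, y) * BASIS[q](x, y)
    return M


M = reference_mass_matrix()
diagonal = [M[p][p] for p in range(len(BASIS))]
```

The products have degree at most four in each variable, so the three point rule integrates them exactly and the off-diagonal entries vanish up to rounding.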


The triangular elements. For the $P^1$ case, we use the following expression for
the approximate solution $u_h(x,y,t)$ inside the triangle $K$:

\[
u_h(x,y,t) = \sum_{i=1}^{3} u_i(t)\, \varphi_i(x,y),
\]

where the degrees of freedom $u_i(t)$ are values of the numerical solution at the
midpoints of edges, and the basis function $\varphi_i(x,y)$ is the linear function which
takes the value 1 at the mid-point $m_i$ of the $i$-th edge, and the value 0 at the
mid-points of the two other edges. The mass matrix is diagonal:

\[
M = |K|\ \operatorname{diag}\Big(\frac{1}{3}, \frac{1}{3}, \frac{1}{3}\Big).
\]

For the $P^2$ case, we use the following expression for the approximate solution
$u_h(x,y,t)$ inside the triangle $K$:

\[
u_h(x,y,t) = \sum_{i=1}^{6} u_i(t)\, \xi_i(x,y),
\]

where the degrees of freedom, $u_i(t)$, are values of the numerical solution at the
three midpoints of edges and the three vertices. The basis function $\xi_i(x,y)$ is the
quadratic function which takes the value 1 at the point $i$ of the six points mentioned
above (the three midpoints of edges and the three vertices), and the value 0 at the
remaining five points. The mass matrix this time is not diagonal.

Limiting. We construct slope limiting operators $\Lambda\Pi_h$ on piecewise linear functions
$u_h$ in such a way that the following properties are satisfied:

1. Accuracy: if $u_h$ is linear then $\Lambda\Pi_h u_h = u_h$.

2. Conservation of mass: for every element $K$ of the triangulation $\mathcal{T}_h$, we have:

   \[
   \int_K \Lambda\Pi_h u_h = \int_K u_h.
   \]

3. Slope limiting: on each element $K$ of $\mathcal{T}_h$, the gradient of $\Lambda\Pi_h u_h$ is not bigger
   than that of $u_h$.

The actual form of the slope limiting operators is closely related to that of the
slope limiting operators studied in [24] and [20].


The rectangular elements. The limiting is performed on $u_x$ and $u_y$ in (4.4),
using the differences of the means. For a scalar equation, $u_x$ would be limited
(replaced) by

\[
\bar m\,(u_x,\ \bar u_{i+1,j} - \bar u_{ij},\ \bar u_{ij} - \bar u_{i-1,j}), \tag{4.7}
\]

where the function $\bar m$ is the TVB corrected minmod function defined in the previous
section.

The TVB correction is needed to avoid unnecessary limiting near smooth
extrema, where the quantity $u_x$ or $u_y$ is on the order of $O(\Delta x^2)$ or $O(\Delta y^2)$. For an
estimate of the TVB constant $M$ in terms of the second derivatives of the function,
see [24]. Usually, the numerical results are not sensitive to the choice of $M$ in a
large range. In all the calculations in this paper we take $M$ to be 50.

Similarly, $u_y$ is limited (replaced) by

\[
\bar m\,(u_y,\ \bar u_{i,j+1} - \bar u_{ij},\ \bar u_{ij} - \bar u_{i,j-1}),
\]

with a change of $\Delta x$ to $\Delta y$ in (4.7).
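The scalar limiter can be sketched as follows. The TVB corrected minmod function itself is defined in the previous section, not here; the version below assumes the usual form (return the first argument unchanged when its magnitude is at most $M \Delta x^2$, otherwise apply minmod), and the function names are illustrative.

```python
def minmod(a1, a2, a3):
    """Smallest argument in magnitude if all three share a sign, else zero."""
    if a1 > 0 and a2 > 0 and a3 > 0:
        return min(a1, a2, a3)
    if a1 < 0 and a2 < 0 and a3 < 0:
        return max(a1, a2, a3)
    return 0.0


def tvb_minmod(a1, a2, a3, M, dx):
    """TVB corrected minmod (assumed form): skip limiting for small a1."""
    if abs(a1) <= M * dx * dx:
        return a1
    return minmod(a1, a2, a3)


def limit_ux(ux, mean_left, mean, mean_right, M=50.0, dx=0.01):
    """Replace u_x by the limited value, using differences of cell means."""
    return tvb_minmod(ux, mean_right - mean, mean - mean_left, M, dx)


near_extremum = limit_ux(0.001, 1.0, 1.0, 0.5)  # kept: |u_x| <= M dx^2
steep = limit_ux(0.5, 0.875, 1.0, 1.25)         # reduced to 0.125
```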

For systems, we perform the limiting in the local characteristic variables. To
limit the vector $u_x$ in the element $ij$, we proceed as follows:

- Find the matrix $R$ and its inverse $R^{-1}$, which diagonalize the Jacobian
  evaluated at the mean in the element $ij$ in the $x$-direction:

  \[
  R^{-1}\, \frac{\partial f_1(\bar u_{ij})}{\partial u}\, R = \Lambda,
  \]

  where $\Lambda$ is a diagonal matrix containing the eigenvalues of the Jacobian. Notice
  that the columns of $R$ are the right eigenvectors of $\frac{\partial f_1(\bar u_{ij})}{\partial u}$ and the rows of
  $R^{-1}$ are the left eigenvectors.

- Transform all quantities needed for limiting, i.e., the three vectors $u_{x,ij}$,
  $\bar u_{i+1,j} - \bar u_{ij}$ and $\bar u_{ij} - \bar u_{i-1,j}$, to the characteristic fields. This is achieved by
  left multiplying these three vectors by $R^{-1}$.

- Apply the scalar limiter (4.7) to each of the components of the transformed
  vectors.

- The result is transformed back to the original space by multiplying by $R$ on
  the left.

The triangular elements. To construct the slope limiting operators for triangular
elements, we proceed as follows. We start by making a simple observation.
Consider the triangles in Figure 4.1, where $m_1$ is the mid-point of the edge on the
boundary of $K_0$ and $b_i$ denotes the barycenter of the triangle $K_i$ for $i = 0,1,2,3$.


Fig. 4.1. Illustration of limiting.


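Since we have that

\[
m_1 - b_0 = \alpha_1\,(b_1 - b_0) + \alpha_2\,(b_2 - b_0),
\]

for some nonnegative coefficients $\alpha_1$, $\alpha_2$ which depend only on $m_1$ and the geometry,
we can write, for any linear function $u_h$,

\[
u_h(m_1) - u_h(b_0) = \alpha_1\,(u_h(b_1) - u_h(b_0)) + \alpha_2\,(u_h(b_2) - u_h(b_0)),
\]

and since

\[
\bar u_{K_i} = \frac{1}{|K_i|} \int_{K_i} u_h = u_h(b_i), \quad i = 0,1,2,3,
\]

we have that

\[
\begin{aligned}
\tilde u_h(m_1, K_0) &\equiv u_h(m_1) - \bar u_{K_0} \\
&= \alpha_1\,(\bar u_{K_1} - \bar u_{K_0}) + \alpha_2\,(\bar u_{K_2} - \bar u_{K_0}) \\
&\equiv \Delta \bar u(m_1, K_0).
\end{aligned}
\]

Now, we are ready to describe the slope limiting. Let us consider a piecewise linear
function $u_h$, and let $m_i$, $i = 1,2,3$, be the three mid-points of the edges of the
triangle $K_0$. We then can write, for $(x,y) \in K_0$,

\[
\begin{aligned}
u_h(x,y) &= \sum_{i=1}^{3} u_h(m_i)\, \varphi_i(x,y) \\
&= \bar u_{K_0} + \sum_{i=1}^{3} \tilde u_h(m_i, K_0)\, \varphi_i(x,y).
\end{aligned}
\]

To compute $\Lambda\Pi_h u_h$, we first compute the quantities

\[
\Delta_i = \bar m\,(\tilde u_h(m_i, K_0),\ \nu\, \Delta \bar u(m_i, K_0)),
\]

where $\bar m$ is the TVB modified minmod function and $\nu > 1$. We take $\nu = 1.5$ in our
numerical runs. Then, if $\sum_{i=1}^{3} \Delta_i = 0$, we simply set

\[
\Lambda\Pi_h u_h(x,y) = \bar u_{K_0} + \sum_{i=1}^{3} \Delta_i\, \varphi_i(x,y).
\]

If $\sum_{i=1}^{3} \Delta_i \ne 0$, we compute

\[
pos = \sum_{i=1}^{3} \max(0, \Delta_i), \qquad neg = \sum_{i=1}^{3} \max(0, -\Delta_i),
\]

and set

\[
\theta^+ = \min\Big(1, \frac{neg}{pos}\Big), \qquad \theta^- = \min\Big(1, \frac{pos}{neg}\Big).
\]

Then, we define

\[
\Lambda\Pi_h u_h(x,y) = \bar u_{K_0} + \sum_{i=1}^{3} \hat\Delta_i\, \varphi_i(x,y),
\]

where

\[
\hat\Delta_i = \theta^+ \max(0, \Delta_i) - \theta^- \max(0, -\Delta_i).
\]

It is very easy to see that this slope limiting operator satisfies the three properties
listed above.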
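The redistribution step that enforces conservation of mass can be sketched as follows. It assumes the scaling factors are $\theta^+ = \min(1, neg/pos)$ and $\theta^- = \min(1, pos/neg)$, and it guards the one-sided cases (all limited values of the same sign) explicitly to avoid a division by zero; the function name `rebalance` is illustrative.

```python
def rebalance(deltas):
    """Rescale limited mid-point deviations so that they sum to zero."""
    if sum(deltas) == 0.0:
        return list(deltas)
    pos = sum(max(0.0, d) for d in deltas)
    neg = sum(max(0.0, -d) for d in deltas)
    # If one side is empty, the other side is scaled to zero entirely.
    theta_plus = min(1.0, neg / pos) if pos > 0.0 else 1.0
    theta_minus = min(1.0, pos / neg) if neg > 0.0 else 1.0
    return [theta_plus * max(0.0, d) - theta_minus * max(0.0, -d)
            for d in deltas]


balanced = rebalance([0.5, 0.25, -0.25])
one_sided = rebalance([0.5, 0.25, 0.25])
```

In the first call the positive entries are scaled by $1/3$ so that they cancel the negative one. Because the three basis functions $\varphi_i$ have equal integrals over $K_0$, a zero sum of the deviations is exactly what keeps the cell mean unchanged, and since both factors are at most one, no deviation grows.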

For systems, we perform the limiting in the local characteristic variables. To
limit $\Delta_i$, we proceed as in the rectangular case, the only difference being that we
work with the following Jacobian:

\[
\frac{\partial}{\partial u} f(\bar u_{K_0}) \cdot \frac{m_i - b_0}{|m_i - b_0|}.
\]

4.4  Computational  results:  Transient,  nonsmooth  solutions 

In this section we present several numerical results obtained with the $P^1$ and $P^2$
(second and third order accurate) RKDG methods with either rectangles or
triangles in the triangulation. These are standard test problems for Euler equations of
compressible gas dynamics.

The double-Mach reflection problem. Double Mach reflection of a strong
shock. This problem was studied extensively in Woodward and Colella [92] and
later by many others. We use exactly the same setup as in [92], namely a Mach 10
shock initially makes a 60° angle with a reflecting wall. The undisturbed air ahead
of the shock has a density of 1.4 and a pressure of 1.

For the rectangle based triangulation, we use a rectangular computational
domain $[0,4] \times [0,1]$, as in [92]. The reflecting wall lies at the bottom of the
computational domain for $\frac{1}{6} \le x \le 4$. Initially a right-moving Mach 10 shock is positioned
at $x = \frac{1}{6}$, $y = 0$ and makes a 60° angle with the $x$-axis. For the bottom boundary,
the exact post-shock condition is imposed for the part from $x = 0$ to $x = \frac{1}{6}$, to
mimic an angled wedge. A reflective boundary condition is used for the rest. At the
top boundary of our computational domain, the flow values are set to describe the
exact motion of the Mach 10 shock. Inflow/outflow boundary conditions are used
for the left and right boundaries. As in [92], only the results in $[0,3] \times [0,1]$ are
displayed.

For  the  triangle  based  triangulation,  we  have  the  freedom  to  treat  irregular 
domains  and  thus  use  a  true  wedged  computational  domain.  Reflective  boundary 
conditions  are  then  used  for  all  the  bottom  boundary,  including  the  sloped  portion. 
Other  boundary  conditions  are  the  same  as  in  the  rectangle  case. 

Uniform rectangles are used in the rectangle based triangulations. Four different meshes are used: 240 × 60 rectangles ($\Delta x = \Delta y = 1/60$); 480 × 120 rectangles ($\Delta x = \Delta y = 1/120$); 960 × 240 rectangles ($\Delta x = \Delta y = 1/240$); and 1920 × 480 rectangles ($\Delta x = \Delta y = 1/480$). The density is plotted in Figure 4.2 for the $P^1$ case and in Figure 4.3 for the $P^2$ case.

To better appreciate the difference between the $P^1$ and $P^2$ results in these pictures, we show a “blown-up” portion around the double Mach region in Figure 4.4 and show one-dimensional cuts along the line $y = 0.4$ in Figures 4.5 and 4.6. In Figure 4.4, we can see that $P^2$ with $\Delta x = \Delta y = 1/240$ has qualitatively the same resolution as $P^1$ with $\Delta x = \Delta y = 1/480$, for the fine details of the complicated structure in this region. $P^2$ with $\Delta x = \Delta y = 1/480$ gives a much better resolution for these structures than $P^1$ with the same number of rectangles.

Moreover, from Figure 4.5, we clearly see that the difference between the results obtained by using $P^1$ and $P^2$, on the same mesh, increases dramatically as the mesh size decreases. This indicates that the use of polynomials of high degree might be beneficial for capturing the above mentioned structures. From Figure 4.6, we see that the results obtained with $P^1$ are qualitatively similar to those obtained with $P^2$ on a coarser mesh; the similarity increases as the mesh size decreases. The conclusion here is that, if one is interested in the above mentioned fine structures, then one can use the third order scheme $P^2$ with only half of the mesh points in each direction as in $P^1$. This translates into a reduction by a factor of 8 in space-time grid points for 2D time dependent problems, and will more than offset the increase of cost per mesh point and the smaller CFL number required by the higher order $P^2$ method. This saving will be even more significant for 3D.
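
The factor of 8 is a simple count, sketched below under the assumption (implicit in the CFL restriction) that the time step scales linearly with the mesh size. The function name `space_time_reduction` is hypothetical.

```python
def space_time_reduction(coarsening: int, space_dim: int) -> int:
    """Factor by which the number of space-time grid points drops when the
    mesh is coarsened by `coarsening` in every space direction.

    Assumes dt is proportional to dx, so the time axis contributes one
    more factor of `coarsening`.
    """
    return coarsening ** (space_dim + 1)

print(space_time_reduction(2, 2))  # 2D time dependent problem: 8
print(space_time_reduction(2, 3))  # 3D time dependent problem: 16
```

The higher cost per mesh point and the smaller CFL number of the $P^2$ scheme reduce this gain by a problem-dependent amount, which the count above does not model.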

The  optimal  strategy,  of  course,  is  to  use  adaptivity  and  concentrate  triangles 
around  the  interesting  region,  and/or  change  the  order  of  the  scheme  in  different 
regions. 

The forward-facing step problem. Flow past a forward facing step. This problem was again studied extensively in Woodward and Colella [92] and later by many others. The setup of the problem is the following: a right-going Mach 3 uniform flow enters a wind tunnel 1 unit wide and 3 units long. The step is 0.2 units high and is located 0.6 units from the left-hand end of the tunnel. The problem is initialized by a uniform, right-going Mach 3 flow. Reflective boundary conditions are applied along the walls of the tunnel, and inflow and outflow boundary conditions are applied at the entrance (left-hand end) and the exit (right-hand end), respectively.

The corner of the step is a singularity, which we study carefully in our numerical experiments. Unlike in [92] and many other papers, we do not modify our scheme near the corner in any way. It is well known that this leads to an erroneous entropy layer at the downstream bottom wall, as well as a spurious Mach stem at the bottom wall. However, these artifacts decrease when the mesh is refined. In Figure 4.7, second order $P^1$ results using rectangle triangulations are shown, for a grid refinement study using $\Delta x = \Delta y = 1/40$, $1/80$, $1/160$, and $1/320$ as mesh sizes. We can clearly see the improved resolution (especially at the upper slip line from the triple point) and decreased artifacts caused by the corner, with increased mesh points. In Figure 4.8, third order $P^2$ results using the same meshes are shown.

To  have  a  better  idea  of  the  nature  of  the  singularity  at  the  corner,  we  display 
the  values  of  the  density  and  the  entropy  along  the  line  y  =  0.2;  note  that  the  corner 
is  located  on  this  line  at  x  =  0.6.  In  Figure  4.9,  we  show  the  results  obtained  with 
P1  and  in  Figure  4.10,  the  results  obtained  with  P2.  At  the  corner  (x  =  0.6),  we 
can  see  that  there  is  a  jump  both  in  the  entropy  and  in  the  density.  As  the  meshsize 
decreases,  the  jump  in  the  entropy  does  not  vary  significantly;  however,  the  jump 
in  the  density  does.  The  sharp  decrease  in  the  density  right  after  the  corner  can 
be  interpreted  as  a  cavitation  effect  that  the  scheme  seems  to  be  able  to  better 
approximate  as  the  meshsize  decreases. 

In  order  to  verify  that  the  erroneous  entropy  layer  at  the  downstream  bottom 
wall  and  the  spurious  Mach  stem  at  the  bottom  wall  are  both  artifacts  caused  by 
the  corner  singularity,  we  use  our  triangle  code  to  locally  refine  near  the  corner 
progressively;  we  use  the  meshes  displayed  in  Figure  4.11.  In  Figure  4.12,  we  plot 
the  density  obtained  by  the  P1  triangle  code,  with  triangles  (roughly  the  resolution 
of  Ax  =  Ay  =  except  around  the  corner).  In  Figure  4.13,  we  plot  the  entropy 
around the corner for the same runs. We can see that, with more triangles concentrated near the corner, the artifacts gradually decrease. Results with $P^2$ codes in Figures 4.14 and 4.15 show a similar trend.


4.5  Computational  results:  Steady  state,  smooth  solutions 

In  this  section,  we  present  some  of  the  numerical  results  of  Bassi  and  Rebay  [4] 
in  two  dimensions  and  Warburton,  Lomtev,  Kirby  and  Karniadakis  [91]  in  three 
dimensions. 

The purpose of the numerical results of Bassi and Rebay [4] that we present is to assess (i) the effect of the quality of the approximation of curved boundaries and (ii) the effect of the degree of the polynomials on the quality of the approximate solution. The test problem we consider here is the two-dimensional steady-state, subsonic flow around a disk at Mach number $M_\infty = 0.38$. Since the solution is smooth and can be computed analytically, the quality of the approximation can be easily assessed.

In Figures 4.16, 4.17, 4.18, and 4.19, details of the meshes around the disk are shown together with the approximate solution given by the RKDG method using piecewise linear elements. These meshes approximate the circle with a polygon. It can be seen that the approximate solutions are of very low quality even for the most refined grid. This is an effect caused by the kinks of the polygon approximating the circle.

This statement can be easily verified by looking at Figures 4.20, 4.21, 4.22, and 4.23. In these pictures the approximate solutions with piecewise linear, quadratic, and cubic elements are shown; the meshes have been modified to render the circle exactly. It is clear that the improvement in the quality of the approximation is enormous. Thus, a high-quality approximation of the boundaries dramatically improves the quality of the approximations.

Also, it can be seen that the higher the degree of the polynomials, the better the quality of the approximations, in particular from Figures 4.20 and 4.21. In [4], Bassi and Rebay show that the RKDG method using polynomials of degree $k$ is $(k+1)$-th order accurate for $k = 1, 2, 3$. As a consequence, an RKDG method using polynomials of a higher degree is more efficient than an RKDG method using polynomials of lower degree.

In [91], Warburton, Lomtev, Kirby and Karniadakis present the same test problem in a three dimensional setting. In Figure 4.24, we can see the three-dimensional mesh and the density isosurfaces. We can also see how, while the mesh is kept fixed and the degree of the polynomials $k$ is increased from 1 to 9, the maximum error on the entropy goes exponentially to zero. (In the picture, a so-called ‘mode’ is equal to $k + 1$.)

4.6  Concluding  remarks 

In  this  section,  we  have  extended  the  RKDG  methods  to  multidimensional  systems. 
We  have  described  in  full  detail  the  algorithms  and  displayed  numerical  results 
showing  the  performance  of  the  methods  for  the  Euler  equations  of  gas  dynamics. 

The  flexibility  of  the  RKDG  method  to  handle  nontrivial  geometries  and  to 
work  with  different  elements  has  been  displayed.  Moreover,  it  has  been  shown  that 
the  use  of  polynomials  of  high  degree  not  only  does  not  degrade  the  resolution  of 
strong  shocks,  but  enhances  the  resolution  of  the  contact  discontinuities  and  renders 
the  scheme  more  efficient  on  smooth  regions. 

Next,  we  extend  the  RKDG  methods  to  convection-dominated  problems. 

Fig. 4.2. Double Mach reflection problem. Second order $P^1$ results. Density $\rho$. 30 equally spaced contour lines from $\rho = 1.3965$ to $\rho = 22.682$. Mesh refinement study. From top to bottom: $\Delta x = \Delta y = 1/60$, $1/120$, $1/240$, and $1/480$.

Fig. 4.3. Double Mach reflection problem. Third order $P^2$ results. Density $\rho$. 30 equally spaced contour lines from $\rho = 1.3965$ to $\rho = 22.682$. Mesh refinement study. From top to bottom: $\Delta x = \Delta y = 1/60$, $1/120$, $1/240$, and $1/480$.

Fig. 4.4. Double Mach reflection problem. Blown-up region around the double Mach stems. Density $\rho$. Third order $P^2$ with $\Delta x = \Delta y = 1/240$ (top); second order $P^1$ with $\Delta x = \Delta y = 1/480$ (middle); and third order $P^2$ with $\Delta x = \Delta y = 1/480$ (bottom).

Fig. 4.5. Double Mach reflection problem. Cut at y = 0.04 of the blown-up region. Density $\rho$. Comparison of second order $P^1$ with third order $P^2$ on the same mesh.

Fig. 4.6. Double Mach reflection problem. Cut at y = 0.04 of the blown-up region. Density $\rho$. Comparison of second order $P^1$ with third order $P^2$ on a coarser mesh.

Fig. 4.7. Forward facing step problem. Second order $P^1$ results. Density $\rho$. 30 equally spaced contour lines from $\rho = 0.090338$ to $\rho = 6.2365$. Mesh refinement study. From top to bottom: $\Delta x = \Delta y = 1/40$, $1/80$, $1/160$, and $1/320$.

Fig. 4.9. Forward facing step problem. Second order $P^1$ results. Values of the density and entropy along the line $y = 0.2$. Mesh refinement study. From top to bottom: $\Delta x = \Delta y = 1/40$, $1/80$, $1/160$, and $1/320$.

Fig. 4.10. Forward facing step problem. Third order $P^2$ results. Values of the density and entropy along the line $y = 0.2$. Mesh refinement study. From top to bottom: $\Delta x = \Delta y = 1/40$, $1/80$, $1/160$, and $1/320$.


Fig.  4.11.  Forward  facing  step  problem.  Detail  of  the  triangulations  associated  with 
the  different  values  of  a.  The  parameter  a  is  the  ratio  between  the  typical  size  of 
the  triangles  near  the  corner  and  that  elsewhere. 




Fig. 4.12. Forward facing step problem. Second order $P^1$ results. Density $\rho$. 30 equally spaced contour lines from $\rho = 0.090338$ to $\rho = 6.2365$. Triangle code. Progressive refinement near the corner.

Fig. 4.13. Forward facing step problem. Second order $P^1$ results. Entropy level curves around the corner. Triangle code. Progressive refinement near the corner.

Fig. 4.14. Forward facing step problem. Third order $P^2$ results. Density $\rho$. 30 equally spaced contour lines from $\rho = 0.090338$ to $\rho = 6.2365$. Triangle code. Progressive refinement near the corner.




Fig. 4.15. Forward facing step problem. Third order $P^2$ results. Entropy level curves around the corner. Triangle code. Progressive refinement near the corner.

Fig. 4.17. Grid “32 × 8” with a piecewise linear approximation of the circle (top) and the corresponding solution (Mach isolines) using $P^1$ elements (bottom).

Fig.  4.24.  Three-dimensional  flow  over  a  semicircular  bump.  Mesh  and  density 
isosurfaces  (top)  and  history  of  convergence  with  p-refinement  of  the  maximum 
entropy  generated  (bottom).  The  degree  of  the  polynomial  plus  one  is  plotted  on 
the  ‘modes’  axis. 




5  The Hamilton-Jacobi equations in several space dimensions

5.1  Introduction 

In this chapter, we consider the RKDG method for multidimensional Hamilton-Jacobi equations. The model problem we consider is the following:

$$\varphi_t + H(\varphi_x, \varphi_y) = 0, \quad \text{in } (0,1)^2 \times (0,T), \qquad (5.1)$$

$$\varphi(x,y,0) = \varphi_0(x,y), \quad \forall\, (x,y) \in (0,1)^2, \qquad (5.2)$$

where we take periodic boundary conditions. The material in this section is based on the work of Hu and Shu [43].


5.2  The  RKDG  method 


As in the one-dimensional case, the main idea in extending the RKDG method to this case is to realize that $u = \varphi_x$ and $v = \varphi_y$ satisfy the following problem:

$$u_t + H(u,v)_x = 0, \quad \text{in } (0,1)^2 \times (0,T), \qquad (5.3)$$

$$v_t + H(u,v)_y = 0, \quad \text{in } (0,1)^2 \times (0,T), \qquad (5.4)$$

$$u(x,y,0) = (\varphi_0)_x(x,y), \quad \forall\, (x,y) \in (0,1)^2, \qquad (5.5)$$

$$v(x,y,0) = (\varphi_0)_y(x,y), \quad \forall\, (x,y) \in (0,1)^2, \qquad (5.6)$$

and that $\varphi$ can be computed from $u$ and $v$ by solving the following problem:

$$\varphi_t = -H(u,v), \quad \text{in } (0,1)^2 \times (0,T), \qquad (5.7)$$

$$\varphi(x,y,0) = \varphi_0(x,y), \quad \forall\, (x,y) \in (0,1)^2. \qquad (5.8)$$
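
The system (5.3), (5.4) comes from differentiating (5.1) formally in each space variable. The short derivation below is a sketch that assumes $\varphi$ is smooth enough for the mixed derivatives to commute.

```latex
\begin{aligned}
\partial_x\bigl(\varphi_t + H(\varphi_x,\varphi_y)\bigr)
  &= (\varphi_x)_t + H(\varphi_x,\varphi_y)_x = u_t + H(u,v)_x = 0,\\
\partial_y\bigl(\varphi_t + H(\varphi_x,\varphi_y)\bigr)
  &= (\varphi_y)_t + H(\varphi_x,\varphi_y)_y = v_t + H(u,v)_y = 0.
\end{aligned}
```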


Again, a straightforward application of the RKDG method to (5.3), (5.5) produces an approximation $u_h$ to $u = \varphi_x$, and a straightforward application of the RKDG method to (5.4), (5.6) produces an approximation $v_h$ to $v = \varphi_y$. Both $u_h$ and $v_h$ are taken to be piecewise polynomials of degree $(k - 1)$. Then, $\varphi_h$ is computed in one of the following ways:


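(i) Take $\varphi_h(\cdot,t)$ in $V_h^k$ such that, $\forall K \in \mathcal{T}_h$, $w_h \in P^k(K)$:

$$\int_K \partial_t \varphi_h(x,y,t)\, w_h(x,y)\, dx\, dy = -\int_K H(u_h(x,y,t), v_h(x,y,t))\, w_h(x,y)\, dx\, dy,$$

$$\int_K \varphi_h(x,y,0)\, w_h(x,y)\, dx\, dy = \int_K \varphi_0(x,y)\, w_h(x,y)\, dx\, dy.$$

(ii) Take $\varphi_h(\cdot,t)$ in $V_h^k$ such that, $\forall K \in \mathcal{T}_h$:

$$\| \nabla \varphi_h - (u_h, v_h) \|_{L^2(K)} = \min_{\psi \in P^k(K)} \| \nabla \psi - (u_h, v_h) \|_{L^2(K)}.$$

This determines $\varphi_h$ up to a constant. To find this constant, we impose the following condition, $\forall K \in \mathcal{T}_h$:

$$\frac{d}{dt} \int_K \varphi_h(x,y,t)\, dx\, dy = -\int_K H(u_h(x,y,t), v_h(x,y,t))\, dx\, dy,$$

$$\int_K \varphi_h(x,y,0)\, dx\, dy = \int_K \varphi_0(x,y)\, dx\, dy.$$

(iii) Pick the element $K_J$ and determine the values of $\varphi_h$ on it in such a way that

$$\| \nabla \varphi_h - (u_h, v_h) \|_{L^2(K_J)} = \min_{\psi \in P^k(K_J)} \| \nabla \psi - (u_h, v_h) \|_{L^2(K_J)}$$

and that

$$\frac{d}{dt} \int_{K_J} \varphi_h(x,y,t)\, dx\, dy = -\int_{K_J} H(u_h(x,y,t), v_h(x,y,t))\, dx\, dy,$$

$$\int_{K_J} \varphi_h(x,y,0)\, dx\, dy = \int_{K_J} \varphi_0(x,y)\, dx\, dy. \qquad (5.9)$$

Then, compute $\varphi_h$ as follows:

$$\varphi_h(B,t) = \varphi_h(A,t) + \int_A^B (\varphi_x\, dx + \varphi_y\, dy) \qquad (5.10)$$

to determine the missing constant in the other elements. The path should be taken to avoid crossing a derivative discontinuity, if possible.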

We remark again that, in the third approach, the recovered values of $\varphi_h$ depend on the choice of the starting point $A$ as well as the integration path. However, this difference is on the level of truncation errors and does not affect the order of accuracy, as is shown in the computational results we show next.
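
The recovery formula (5.10) and its path dependence can be illustrated with a small sketch. It is not the RKDG implementation: a smooth test function with known gradient stands in for $(u_h, v_h)$, the midpoint rule stands in for the element-wise integration, and all names (`phi_exact`, `recover_x_then_y`, `recover_y_then_x`) are hypothetical.

```python
import math

def phi_exact(x, y):
    return math.sin(x) * math.cos(y)

def u(x, y):  # stands in for u_h, an approximation of phi_x
    return math.cos(x) * math.cos(y)

def v(x, y):  # stands in for v_h, an approximation of phi_y
    return -math.sin(x) * math.sin(y)

def integrate(f, a, b, n=200):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def recover_x_then_y(ax, ay, bx, by):
    """phi(B) = phi(A) + int (u dx + v dy); horizontal leg first."""
    return (phi_exact(ax, ay)
            + integrate(lambda x: u(x, ay), ax, bx)
            + integrate(lambda y: v(bx, y), ay, by))

def recover_y_then_x(ax, ay, bx, by):
    """Same formula with the vertical leg first."""
    return (phi_exact(ax, ay)
            + integrate(lambda y: v(ax, y), ay, by)
            + integrate(lambda x: u(x, by), ax, bx))

p1 = recover_x_then_y(0.0, 0.0, 1.0, 0.5)
p2 = recover_y_then_x(0.0, 0.0, 1.0, 0.5)
print(abs(p1 - p2), abs(p1 - phi_exact(1.0, 0.5)))
```

In this sketch the two recovered values differ only by the quadrature error, which plays the role of the truncation-level difference mentioned above.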


5.3  Computational  results 

The purpose of the numerical experiments we report in this section is to assess the accuracy of the method, to see if the generalized slope limiter is actually needed, and to evaluate the effect of changing the integration path. The third approach is used.

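First test problem. Two dimensional Burgers’ equation:

$$\varphi_t + \frac{(\varphi_x + \varphi_y + 1)^2}{2} = 0, \quad \text{in } (-2,2)^2 \times (0,T),$$

$$\varphi(x,y,0) = -\cos\left(\frac{\pi (x+y)}{2}\right), \quad \forall\, (x,y) \in (-2,2)^2,$$

with periodic boundary conditions.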

We first use uniform rectangular meshes and the local Lax-Friedrichs flux. At $T = 0.5/\pi^2$, the solution is still smooth. The errors (computed at the center of the cells) and orders of accuracy are listed in Table 5.1. It seems that only $k$-th order of accuracy is achieved when $\varphi$ is a piecewise polynomial of degree $k$. Next, as in the one dimensional case, we use non-uniform rectangular meshes obtained from the tensor product of one dimensional nonuniform meshes (the meshes in the two directions are independent). Again, we give the “real” $L^2$-errors computed by a 6 × 6 point Gaussian quadrature as well. The results are shown in Table 5.2.
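
A short characteristic argument indicates why the solution is smooth at $T = 0.5/\pi^2$ but not at $T = 1.5/\pi^2$. The sketch below assumes the first test problem reads $\varphi_t + (\varphi_x + \varphi_y + 1)^2/2 = 0$ with $\varphi(x,y,0) = -\cos(\pi(x+y)/2)$, so that $\varphi$ depends on $(x,y)$ only through $\xi = x + y$; the auxiliary variable $w$ and the breaking time $t_b$ are notation introduced here.

```latex
\begin{aligned}
&\varphi_t + \tfrac12\,(2\varphi_\xi + 1)^2 = 0, \qquad w := 2\varphi_\xi + 1
  \;\Longrightarrow\; w_t + 2\,w\,w_\xi = 0,\\
&w_0(\xi) = \pi \sin\!\left(\tfrac{\pi\xi}{2}\right) + 1, \qquad
  2\,w_0'(\xi) = \pi^2 \cos\!\left(\tfrac{\pi\xi}{2}\right),\\
&t_b = \frac{-1}{\min_\xi\, 2\,w_0'(\xi)} = \frac{1}{\pi^2},
  \qquad \frac{0.5}{\pi^2} < t_b < \frac{1.5}{\pi^2}.
\end{aligned}
```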

The results in Tables 5.1 and 5.2 are obtained by updating the element at the lower left corner with time, and then taking an integration path consisting of line segments starting from the corner, first parallel to the x-axis and then vertically to the point. To further address the issue of the dependency of the computed values of the solution $\varphi$ on the integration path and starting point, we use another path which starts vertically and then runs parallel to the x-axis to reach the point. In Table 5.3, we list the difference of the two solutions $\varphi$ recovered from these two different integration paths, for the non-uniform mesh cases. We can see that these differences are at the level of the local truncation errors and decay at the same order as the errors. Thus the choice of integration path in recovering $\varphi$ does not affect accuracy.

At $T = 1.5/\pi^2$, the solution has discontinuous derivatives. Fig. 5.1 is the graph of the numerical solution with 40 × 40 elements (uniform mesh).

Finally, we use a triangle based triangulation; the mesh with $h = 1$ is shown in Fig. 5.2. The accuracy at $T = 0.5/\pi^2$ is shown in Table 5.4. A similar accuracy pattern is observed as in the rectangular case. The result at $T = 1.5/\pi^2$, when the derivative is discontinuous, is shown in Fig. 5.3.


Table 5.1. Accuracy for 2D Burgers equation, uniform rectangular mesh, $T = 0.5/\pi^2$.

Table 5.2. Accuracy for 2D Burgers equation, non-uniform rectangular mesh, $T = 0.5/\pi^2$.
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3.85E-05 

1.76E-06 

2.817 

pl  _ 

pi 

pi 

NxN 

L°°  error 

L°°  error 

12223 

L°°  error 

- 

4.68E-02 

- 

1.00E-02 

20  x  20 

1.25E-01 

nreeii 

mm 

40  x  40 

5.74E-02 

3.54E-03 

1.797 

2.29E-04 

jfp# 

80  x  80 

2.78E-02 

00 

1.15E-03 

fwm 

5.11E-05 

160  x  160 

1.42E-02 

2.72E-04 

7.16E-06 

Table 5.3. Differences of the solution $\varphi$ recovered by two different integration paths, non-uniform mesh, Burgers equation.


Pl 

pi 

P6 

NxN 

2 

LL  error 

12051111 

10  x  10 

8.61E-03 

— 

2.90E-03 

1.15E-03 

— 

20  x  20 

4.64E-03 

1.28E-03 

nsm 

2.237 

4.12E-04 

3.76E-05 

eeeei 

1.39E-04 

6.71E-06 

160  x  160 

1.09E-03 

3.66E-05 

1.925 

8.79E-07 

2.932 

Table 5.4. Accuracy for 2D Burgers equation, triangular mesh, $T = 0.5/\pi^2$.

                       P2                                      P3
h      L1 error   order   L∞ error   order    L1 error   order   L∞ error   order
1      5.48E-02   -       1.52E-01   -        1.17E-02   -       2.25E-02   -
1/2    1.35E-02   2.02    6.26E-02   1.28     1.35E-03   3.12    4.12E-03   2.45
1/4    2.94E-03   2.20    1.55E-02   2.01     1.45E-04   3.22    4.31E-04   3.26
1/8    6.68E-04   2.14    3.44E-03   2.17     1.71E-05   3.08    7.53E-05   2.52

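
The "order" columns in these tables are observed convergence rates between successive meshes. A minimal sketch of that computation follows, using the $P^2$, $L^1$ errors of Table 5.4 as input; the function name `observed_orders` is hypothetical, and the refinement ratio of 2 is read off from the mesh sizes $h = 1, 1/2, 1/4, 1/8$.

```python
import math

def observed_orders(errors, ratio=2.0):
    """Observed orders log(e_coarse / e_fine) / log(ratio) for a sequence
    of errors on meshes refined by `ratio` at each step."""
    return [math.log(coarse / fine) / math.log(ratio)
            for coarse, fine in zip(errors, errors[1:])]

# L1 errors for P2 on the triangular meshes h = 1, 1/2, 1/4, 1/8 (Table 5.4)
l1_errors_p2 = [5.48e-2, 1.35e-2, 2.94e-3, 6.68e-4]
print([round(p, 2) for p in observed_orders(l1_errors_p2)])  # [2.02, 2.2, 2.14]
```

The printed values agree with the orders 2.02, 2.20, and 2.14 listed in the table.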

Second test problem. We consider the following problem:

$$\varphi_t - \cos(\varphi_x + \varphi_y + 1) = 0, \quad \text{in } (-2,2)^2 \times (0,T),$$

$$\varphi(x,y,0) = -\cos\left(\frac{\pi (x+y)}{2}\right), \quad \forall\, (x,y) \in (-2,2)^2,$$

with periodic boundary conditions.

For this example we use uniform rectangular meshes. The local Lax-Friedrichs flux is used. The solution is smooth at $T = 0.5/\pi^2$. The accuracy of the numerical solution is shown in Table 5.5.


Table 5.5. Accuracy, 2D, $H(u,v) = -\cos(u + v + 1)$, $T = 0.5/\pi^2$.

                  P1                  P2                  P3
N x N       L1 error   order    L1 error   order    L1 error   order
10 x 10     6.47E-02   -        8.31E-03   -        1.35E-02   -
20 x 20     2.54E-02   1.349    1.93E-03   2.106    1.57E-03   3.104
40 x 40     1.05E-02   1.274    4.58E-04   2.075    2.39E-04   2.716
80 x 80     4.74E-03   1.147    1.13E-04   2.019    2.89E-05   3.048
160 x 160   2.23E-03   1.088    2.83E-05   1.997    4.38E-06   2.722

                  P1                  P2                  P3
N x N       L∞ error   order    L∞ error   order    L∞ error   order
10 x 10     1.47E-01   -        1.88E-02   -        2.36E-02   -
20 x 20     6.75E-02   1.123    7.34E-03   1.357    3.44E-03   2.778
40 x 40     2.65E-02   1.349    1.83E-03   2.004    4.59E-04   2.906
80 x 80     1.18E-02   1.167    4.55E-04   2.008    5.78E-05   2.989
160 x 160   2.23E-03   1.088    1.13E-04   2.010    8.54E-06   2.759

The solution has developed a discontinuous derivative at $T = 1.5/\pi^2$. Results with 40 × 40 elements are shown in Fig. 5.4.

Third test problem. The level set equation in a domain with a hole:

$$\varphi_t + \mathrm{sign}(\varphi_0)\left(\sqrt{\varphi_x^2 + \varphi_y^2} - 1\right) = 0, \quad \text{in } \Omega \times (0,T),$$

<p{x,y,  0)  =  -  cos  ,  V(x,y)en, 

where $\Omega = \{(x,y) : 1/2 < \sqrt{x^2 + y^2} < 1\}$.

This problem was introduced in [84]. Its exact solution $\varphi$ has the same zero level set as $\varphi_0$, and the steady state solution is the distance function to that zero level curve. We use this problem to test the effect on the accuracy of the approximation of using various integration paths (5.10) when there is a hole in the region. Notice that the exact steady state solution is the distance function to the inner boundary of the domain when the boundary condition is adequately prescribed. We compute the time dependent problem to reach a steady state solution, using the exact solution for the boundary conditions of $\varphi_x$ and $\varphi_y$. Four symmetric elements near the outer boundary are updated by (5.9); all other elements are recovered from (5.10) by the shortest path to the nearest one of the above four elements. The results are shown in Table 5.6. Also shown in Table 5.6 is the error (difference) between the numerical solution $\varphi$ thus recovered and the value of $\varphi$ after another integration along a circular path (starting and ending at the same point in (5.10)). We can see that the difference is small with the correct order of accuracy, further indicating that the dependency of the recovered solution $\varphi$ on the integration path is on the order of the truncation errors even for such problems with holes. Finally, the mesh with 1432 triangles and the solution with 5608 triangles are shown in Fig. 5.5.


Table 5.6. Errors for the level set equation, triangular mesh with $P^2$.


Errors  for  the  Solution 

Errors  by  Integration  Path  | 

L°°  error 

403 

nrngasra 

— 

1.32E-03 

— 

|1.61E-04j 

— | 

5.71E-04 

— 

1432 

3.05 

2.73E-04 

wm 

1.68E-04 

1.78 

5608 

1 1 .71E-05 1 

2.85 

3.18E-05 

ESQ 

9.32E-06 

Hi] 

EES 

3.03 

5.01E-06 

2.67 

1.43E-06 

6.63E-06 

ESQ 

Fourth test problem. Two dimensional Riemann problem:

$$\varphi_t + \sin(\varphi_x + \varphi_y) = 0, \quad \text{in } (-1,1)^2 \times (0,T),$$

$$\varphi(x,y,0) = \pi (|y| - |x|), \quad \forall\, (x,y) \in (-1,1)^2.$$

For this example we use a uniform rectangular mesh with 40 × 40 elements. The local Lax-Friedrichs flux is used. As was mentioned in Example 4.3, we have found that nonlinear limiting is needed for convergence towards the viscosity solution. We show the numerical solution at $T = 1$ in Fig. 5.6.

Fifth test problem. A problem from optimal control [73]:

$$\varphi_t + (\sin y)\,\varphi_x + (\sin x + \mathrm{sign}(\varphi_y))\,\varphi_y = \tfrac{1}{2} \sin^2 y + (1 - \cos x),$$

$$\varphi(x,y,0) = 0,$$

where the space domain is $(-\pi,\pi)^2$ and the boundary conditions are periodic. We use a uniform rectangular mesh of 40 × 40 elements and the local Lax-Friedrichs flux. The solution at $T = 1$ is shown in Fig. 5.7, while the optimal control $\omega = \mathrm{sign}(\varphi_y)$ is shown in Fig. 5.8.

Notice that our method computes $\nabla\varphi$ as an independent variable. This is very desirable for problems in which the most interesting features are contained in the first derivatives of $\varphi$, as in this optimal control problem.

Sixth test problem. A problem from computer vision [78]:

$$\varphi_t + I(x,y) \sqrt{1 + \varphi_x^2 + \varphi_y^2} - 1 = 0, \quad \text{in } (-1,1)^2 \times (0,T),$$

$$\varphi(x,y,0) = 0, \quad \forall\, (x,y) \in (-1,1)^2,$$

with $\varphi = 0$ as the boundary condition. The steady state solution of this problem is the shape lighted by a source located at infinity with vertical direction. The solution is not unique if there are points at which $I(x,y) = 1$. Conditions must be prescribed at those points where $I(x,y) = 1$. Since our method is a finite element method, we need to prescribe suitable conditions at the corresponding elements. We take

$$I(x,y) = 1 \Big/ \sqrt{1 + (1 - |x|)^2 + (1 - |y|)^2}. \qquad (5.1)$$

The exact steady solution is $\varphi(x,y,\infty) = (1 - |x|)(1 - |y|)$. We use a uniform rectangular mesh of 40 × 40 elements and the local Lax-Friedrichs flux. We impose the exact boundary conditions for $u = \varphi_x$, $v = \varphi_y$ from the above exact steady solution, and take the exact value at one point (the lower left corner) to recover $\varphi$. The results for $P^2$ and $P^3$ are presented in Fig. 5.3, while Fig. 5.9 contains the history of iterations to the steady state.

Next we take

$$I(x,y) = 1 \Big/ \sqrt{1 + 4y^2 (1 - x^2)^2 + 4x^2 (1 - y^2)^2}. \qquad (5.2)$$

The exact steady solution is $\varphi(x,y,\infty) = (1 - x^2)(1 - y^2)$. We again use a uniform rectangular mesh of 40 × 40 elements and the local Lax-Friedrichs flux, impose the exact boundary conditions for $u = \varphi_x$, $v = \varphi_y$ from the above exact steady solution, and take the exact value at one point (the lower left corner) to recover $\varphi$. A continuation method is used, with the steady solution using

$$I_\epsilon(x,y) = 1 \Big/ \sqrt{1 + 4y^2 (1 - x^2)^2 + 4x^2 (1 - y^2)^2 + \epsilon} \qquad (5.3)$$

for bigger $\epsilon$ as the initial condition for smaller $\epsilon$. The sequence of $\epsilon$ used is $\epsilon = 0.2, 0.05, 0$. The results for $P^2$ and $P^3$ are presented in Fig. 5.10.
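
That the two stated exact steady solutions satisfy $I(x,y)\sqrt{1 + \varphi_x^2 + \varphi_y^2} - 1 = 0$ can be checked pointwise. The sketch below does so with hand-coded gradients; the function names `residual_abs` and `residual_smooth` are hypothetical, and the first check is only meaningful away from the axes $x = 0$ and $y = 0$, where $(1 - |x|)(1 - |y|)$ is not differentiable.

```python
import math

def residual_abs(x, y):
    """Steady residual for phi = (1 - |x|)(1 - |y|) with
    I = 1 / sqrt(1 + (1 - |x|)^2 + (1 - |y|)^2); requires x != 0, y != 0."""
    intensity = 1.0 / math.sqrt(1.0 + (1.0 - abs(x)) ** 2 + (1.0 - abs(y)) ** 2)
    phi_x = -math.copysign(1.0, x) * (1.0 - abs(y))
    phi_y = -math.copysign(1.0, y) * (1.0 - abs(x))
    return intensity * math.sqrt(1.0 + phi_x ** 2 + phi_y ** 2) - 1.0

def residual_smooth(x, y):
    """Steady residual for phi = (1 - x^2)(1 - y^2) with
    I = 1 / sqrt(1 + 4 y^2 (1 - x^2)^2 + 4 x^2 (1 - y^2)^2)."""
    intensity = 1.0 / math.sqrt(
        1.0 + 4.0 * y * y * (1.0 - x * x) ** 2 + 4.0 * x * x * (1.0 - y * y) ** 2)
    phi_x = -2.0 * x * (1.0 - y * y)
    phi_y = -2.0 * y * (1.0 - x * x)
    return intensity * math.sqrt(1.0 + phi_x ** 2 + phi_y ** 2) - 1.0

for point in [(0.3, -0.7), (-0.5, 0.2), (0.9, 0.9)]:
    print(point, residual_abs(*point), residual_smooth(*point))
```

Both residuals vanish up to rounding at every sample point.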

Fig. 5.3. Two dimensional Burgers’ equation, triangular mesh, $T = 1.5/\pi^2$. Panels: $P^2$ with $h = 1/8$ and $P^3$ with $h = 1/8$.

Fig. 5.4. Two dimensional, $H(u,v) = -\cos(u + v + 1)$, $T = 1.5/\pi^2$.

Fig. 5.5. The level set equation, $P^2$.

Fig. 5.6. Two dimensional Riemann problem, $H(u,v) = \sin(u + v)$, $T = 1$. Panels: $P^2$ and $P^3$, 40 × 40 elements.

Fig. 5.9. Computer vision problem, history of iterations.

Fig. 5.10. Computer vision problem, $\varphi(x,y,\infty) = (1 - x^2)(1 - y^2)$. Panels: $P^2$ and $P^3$, 40 × 40 elements.

6  Convection diffusion: The LDG method

6.1  Introduction

In this chapter, which follows the work by Cockburn and Shu [27], we restrict ourselves to the semidiscrete LDG methods for convection-diffusion problems with periodic boundary conditions. Our aim is to clearly display the most distinctive features of the LDG methods in a setting as simple as possible; the extension of the method to the fully discrete case is straightforward. In §2, we introduce the LDG methods for the simple one-dimensional case $d = 1$ in which

$$F(u, Du) = f(u) - a(u)\, \partial_x u,$$

$u$ is a scalar and $a(u) \ge 0$, and show, in §3, some preliminary numerical results displaying the performance of the method. In this simple setting, the main ideas of how to devise the method and how to analyze it can be clearly displayed in a simple way. Thus, the $L^2$-stability of the method is proven in the general nonlinear case and the rate of convergence of $(\Delta x)^k$ in the $L^\infty(0,T;L^2)$-norm for polynomials of degree $k \ge 0$ in the linear case is obtained; this estimate is sharp. In §4, we extend these results to the case in which $u$ is a scalar and

$$F_i(u, Du) = f_i(u) - \sum_{1 \le j \le d} a_{ij}(u)\, \partial_{x_j} u,$$

where $a_{ij}$ defines a positive semidefinite matrix. Again, the $L^2$-stability of the method is proven for the general nonlinear case and the rate of convergence of $(\Delta x)^k$ in the $L^\infty(0,T;L^2)$-norm for polynomials of degree $k \ge 0$ and arbitrary triangulations is proven in the linear case. In this case, the multidimensionality of the problem and the arbitrariness of the grids increase the technicality of the analysis of the method which, nevertheless, uses the same ideas as in the one-dimensional case. In §5, the extension of the LDG method to multidimensional systems is briefly described and in §6, some numerical results for the compressible Navier-Stokes equations from the paper by Bassi and Rebay [3] and from the paper by Lomtev and Karniadakis [63] are presented.

6.2  The LDG methods for the one-dimensional case

In this section, we present and analyze the LDG methods for the following simple
model problem:

\[ \partial_t u + \partial_x \big( f(u) - a(u)\,\partial_x u \big) = 0 \quad \text{in } Q, \tag{6.1} \]
\[ u(t = 0) = u_0 \quad \text{on } (0,1), \tag{6.2} \]

where $Q = (0,T) \times (0,1)$, with periodic boundary conditions.

General formulation and main properties. To define the LDG method,
we introduce the new variable $q = \sqrt{a(u)}\,\partial_x u$ and rewrite the problem (6.1), (6.2)
as follows:

\[ \partial_t u + \partial_x \big( f(u) - \sqrt{a(u)}\,q \big) = 0 \quad \text{in } Q, \tag{6.3} \]
\[ q - \partial_x g(u) = 0 \quad \text{in } Q, \tag{6.4} \]
\[ u(t = 0) = u_0 \quad \text{on } (0,1), \tag{6.5} \]

where $g(u) = \int^u \sqrt{a(s)}\,ds$. The LDG method for (6.1), (6.2) is now obtained by
simply discretizing the above system with the Discontinuous Galerkin method.

To do that, we follow [24] and [21]. We define the flux $\mathbf{h} = (h_u, h_q)^t$ as follows:

\[ \mathbf{h}(u, q) = \big( f(u) - \sqrt{a(u)}\,q,\; -g(u) \big)^t. \tag{6.6} \]

For each partition of the interval $(0,1)$, $\{ x_{j+1/2} \}_{j=0}^{N}$, we set, for $j = 1, \dots, N$:

\[ I_j = (x_{j-1/2}, x_{j+1/2}), \qquad \Delta x_j = x_{j+1/2} - x_{j-1/2}, \tag{6.7} \]

and

\[ \Delta x = \max_{1 \le j \le N} \Delta x_j. \tag{6.8} \]


We seek an approximation $\mathbf{w}_h = (u_h, q_h)^t$ to $\mathbf{w} = (u, q)^t$ such that for each time
$t \in [0,T]$, both $u_h(t)$ and $q_h(t)$ belong to the finite dimensional space

\[ V_h = V_h^k = \{ v \in L^1(0,1) : v|_{I_j} \in P^k(I_j),\; j = 1, \dots, N \}, \tag{6.9} \]

where $P^k(I)$ denotes the space of polynomials in $I$ of degree at most $k$. In order
to determine the approximate solution $(u_h, q_h)$, we first note that by multiplying
(6.3), (6.4), and (6.5) by arbitrary, smooth functions $v_u$, $v_q$, and $v_i$, respectively,
and integrating over $I_j$, we get, after a simple formal integration by parts in (6.3)
and (6.4),


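\[
\begin{aligned}
&\int_{I_j} \partial_t u(x,t)\,v_u(x)\,dx - \int_{I_j} h_u(\mathbf{w}(x,t))\,\partial_x v_u(x)\,dx \\
&\quad + h_u(\mathbf{w}(x_{j+1/2},t))\,v_u(x_{j+1/2}^-) - h_u(\mathbf{w}(x_{j-1/2},t))\,v_u(x_{j-1/2}^+) = 0,
\end{aligned} \tag{6.10}
\]

\[
\begin{aligned}
&\int_{I_j} q(x,t)\,v_q(x)\,dx - \int_{I_j} h_q(\mathbf{w}(x,t))\,\partial_x v_q(x)\,dx \\
&\quad + h_q(\mathbf{w}(x_{j+1/2},t))\,v_q(x_{j+1/2}^-) - h_q(\mathbf{w}(x_{j-1/2},t))\,v_q(x_{j-1/2}^+) = 0,
\end{aligned} \tag{6.11}
\]

\[ \int_{I_j} u(x,0)\,v_i(x)\,dx = \int_{I_j} u_0(x)\,v_i(x)\,dx. \tag{6.12} \]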


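Next, we replace the smooth functions $v_u$, $v_q$, and $v_i$ by test functions $v_{h,u}$, $v_{h,q}$, and
$v_{h,i}$, respectively, in the finite element space $V_h$ and the exact solution $\mathbf{w} = (u, q)^t$
by the approximate solution $\mathbf{w}_h = (u_h, q_h)^t$. Since this function is discontinuous
in each of its components, we must also replace the nonlinear flux $\mathbf{h}(\mathbf{w}(x_{j+1/2}, t))$
by a numerical flux $\hat{\mathbf{h}}(\mathbf{w}_h)_{j+1/2}(t) = (\hat{h}_u(\mathbf{w}_h)_{j+1/2}(t), \hat{h}_q(\mathbf{w}_h)_{j+1/2}(t))$ that will be
suitably chosen later. Thus, the approximate solution given by the LDG method is
defined as the solution of the following weak formulation:

For all $v_{h,u} \in P^k(I_j)$:

\[
\begin{aligned}
&\int_{I_j} \partial_t u_h(x,t)\,v_{h,u}(x)\,dx - \int_{I_j} h_u(\mathbf{w}_h(x,t))\,\partial_x v_{h,u}(x)\,dx \\
&\quad + \hat{h}_u(\mathbf{w}_h)_{j+1/2}(t)\,v_{h,u}(x_{j+1/2}^-) - \hat{h}_u(\mathbf{w}_h)_{j-1/2}(t)\,v_{h,u}(x_{j-1/2}^+) = 0,
\end{aligned} \tag{6.13}
\]

for all $v_{h,q} \in P^k(I_j)$:

\[
\begin{aligned}
&\int_{I_j} q_h(x,t)\,v_{h,q}(x)\,dx - \int_{I_j} h_q(\mathbf{w}_h(x,t))\,\partial_x v_{h,q}(x)\,dx \\
&\quad + \hat{h}_q(\mathbf{w}_h)_{j+1/2}(t)\,v_{h,q}(x_{j+1/2}^-) - \hat{h}_q(\mathbf{w}_h)_{j-1/2}(t)\,v_{h,q}(x_{j-1/2}^+) = 0,
\end{aligned} \tag{6.14}
\]

for all $v_{h,i} \in P^k(I_j)$:

\[ \int_{I_j} \big( u_h(x,0) - u_0(x) \big)\,v_{h,i}(x)\,dx = 0. \tag{6.15} \]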


It only remains to choose the numerical flux $\hat{\mathbf{h}}(\mathbf{w}_h)_{j+1/2}(t)$. We use the notation:

\[ [p] = p^+ - p^-, \qquad \bar{p} = \tfrac{1}{2}(p^+ + p^-), \qquad p^{\pm}_{j+1/2} = p(x^{\pm}_{j+1/2}). \]

To be consistent with the type of numerical fluxes used in the RKDG methods, we
consider numerical fluxes of the form

\[ \hat{\mathbf{h}}(\mathbf{w}_h)_{j+1/2}(t) = \hat{\mathbf{h}}\big( \mathbf{w}_h(x^-_{j+1/2}, t),\, \mathbf{w}_h(x^+_{j+1/2}, t) \big), \]

that:

(i) Are locally Lipschitz and consistent with the flux $\mathbf{h}$,

(ii) Allow for a local resolution of $q_h$ in terms of $u_h$,

(iii) Reduce to an E-flux (see Osher [71]) when $a(\cdot) \equiv 0$,

(iv) Enforce the $L^2$-stability of the method.

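To reflect the convection-diffusion nature of the problem under consideration,
we write our numerical flux as the sum of a convective flux and a diffusive flux:

\[ \hat{\mathbf{h}}(\mathbf{w}^-, \mathbf{w}^+) = \hat{\mathbf{h}}_{conv}(\mathbf{w}^-, \mathbf{w}^+) + \hat{\mathbf{h}}_{diff}(\mathbf{w}^-, \mathbf{w}^+). \tag{6.16} \]

The convective flux is given by

\[ \hat{\mathbf{h}}_{conv}(\mathbf{w}^-, \mathbf{w}^+) = \big( \hat{f}(u^-, u^+),\, 0 \big)^t, \tag{6.17} \]

where $\hat{f}(u^-, u^+)$ is any locally Lipschitz E-flux consistent with the nonlinearity $f$,
and the diffusive flux is given by

\[ \hat{\mathbf{h}}_{diff}(\mathbf{w}^-, \mathbf{w}^+) = \Big( -\frac{[g(u)]}{[u]}\,\bar{q},\; -\overline{g(u)} \Big)^t - \mathbb{C}_{diff}\,[\mathbf{w}], \tag{6.18} \]

where

\[ \mathbb{C}_{diff} = \begin{pmatrix} 0 & c_{12} \\ -c_{12} & 0 \end{pmatrix}, \tag{6.19} \]

\[ c_{12} = c_{12}(\mathbf{w}^-, \mathbf{w}^+) \text{ is locally Lipschitz}, \tag{6.20} \]

\[ c_{12} \equiv 0 \text{ when } a(\cdot) \equiv 0. \tag{6.21} \]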

We  claim  that  this  flux  satisfies  the  properties  (i)  to  (iv) . 

Let us prove our claim. That the flux $\hat{\mathbf{h}}$ is consistent with the flux $\mathbf{h}$ easily
follows from their definitions. That $\hat{\mathbf{h}}$ is locally Lipschitz follows from the fact that
$\hat{f}(\cdot, \cdot)$ is locally Lipschitz and from (6.19); we assume that $f(\cdot)$ and $a(\cdot)$ are locally
Lipschitz functions, of course. Property (i) is hence satisfied.

That the approximate solution $q_h$ can be resolved element by element in terms
of $u_h$ by using (6.14) follows from the fact that, by (6.18) and (6.19), the flux

\[ \hat{h}_q = -\overline{g(u)} + c_{12}\,[u] \]

is independent of $q_h$. Property (ii) is hence satisfied.

Property  (iii)  is  also  satisfied  by  (6.21)  and  by  the  construction  of  the  convective 
flux. 

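To see that the property (iv) is satisfied, let us first rewrite the flux $\hat{\mathbf{h}}$ in the
following way:

\[ \hat{\mathbf{h}}(\mathbf{w}^-, \mathbf{w}^+) = \Big( \frac{[\phi(u)]}{[u]} - \frac{[g(u)]}{[u]}\,\bar{q},\; -\overline{g(u)} \Big)^t - \mathbb{C}\,[\mathbf{w}], \]

where

\[ \mathbb{C} = \begin{pmatrix} c_{11} & c_{12} \\ -c_{12} & 0 \end{pmatrix}, \qquad c_{11} = \frac{1}{[u]} \Big( \frac{[\phi(u)]}{[u]} - \hat{f}(u^-, u^+) \Big), \tag{6.22} \]

with $\phi(u)$ defined by $\phi(u) = \int^u f(s)\,ds$. Since $\hat{f}(\cdot, \cdot)$ is an E-flux,

\[ c_{11} = \frac{1}{[u]^2} \int_{u^-}^{u^+} \big( f(s) - \hat{f}(u^-, u^+) \big)\,ds \ge 0, \]

and so, by (6.19), the matrix $\mathbb{C}$ is semipositive definite. The property (iv) follows
from this fact and from the following result.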


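Proposition 26. (Stability) We have,

\[ \frac{1}{2} \int_0^1 u_h^2(x,T)\,dx + \int_0^T\!\!\int_0^1 q_h^2(x,t)\,dx\,dt + \Theta_{T,\mathbb{C}}([\mathbf{w}_h]) \le \frac{1}{2} \int_0^1 u_0^2(x)\,dx, \]

where $\Theta_{T,\mathbb{C}}([\mathbf{w}_h])$ is the following expression:

\[ \Theta_{T,\mathbb{C}}([\mathbf{w}_h]) = \int_0^T \sum_{1 \le j \le N} \big\{ [\mathbf{w}_h(t)]^t\, \mathbb{C}\, [\mathbf{w}_h(t)] \big\}_{j+1/2}\,dt. \]

For a proof, see the appendix. Thus, this shows that the flux $\hat{\mathbf{h}}$ under consideration
does satisfy the properties (i) to (iv), as claimed.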
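As a small numerical check of the sign of $c_{11}$, the sketch below evaluates the first form of $c_{11}$ in (6.22) for the linear flux $f(u) = c\,u$ with the upwind numerical flux, for which one expects $c_{11} = |c|/2$. The function names `c11`, `phi` and `upwind`, and the sample values, are illustrative assumptions and do not come from the text.

```python
def c11(phi, f_hat, u_minus, u_plus):
    """c_11 = ([phi(u)]/[u] - f_hat(u^-, u^+)) / [u], with [u] = u^+ - u^-."""
    jump = u_plus - u_minus
    return ((phi(u_plus) - phi(u_minus)) / jump - f_hat(u_minus, u_plus)) / jump


def make_linear_case(c):
    """Return (phi, upwind flux) for f(u) = c*u; phi is a primitive of f."""
    def phi(u):
        return 0.5 * c * u * u

    def upwind(u_minus, u_plus):
        # Take the state on the side the wave comes from.
        return c * u_minus if c >= 0 else c * u_plus

    return phi, upwind


phi_neg, upwind_neg = make_linear_case(-1.5)
phi_pos, upwind_pos = make_linear_case(2.0)
value_neg = c11(phi_neg, upwind_neg, 0.3, 1.1)   # expected |c|/2 = 0.75
value_pos = c11(phi_pos, upwind_pos, 1.1, 0.3)   # expected |c|/2 = 1.0
```

In both cases the result is nonnegative and independent of the two states, which is the property used to conclude that the matrix in (6.22) is semipositive definite.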

Now, we turn to the question of the quality of the approximate solution defined
by the LDG method. In the linear case $f' \equiv c$ and $a(\cdot) \equiv a$, from the above stability
result and from the approximation properties of the finite element space $V_h$,
we can prove the following error estimate. We denote the $L^2(0,1)$-norm of the $\ell$-th
derivative of $u$ by $|u|_\ell$.


Theorem 27. (Error estimate) Let $\mathbf{e}$ be the approximation error $\mathbf{w} - \mathbf{w}_h$. Then
we have,

\[ \Big\{ \int_0^1 |e_u(x,T)|^2\,dx + \int_0^T\!\!\int_0^1 |e_q(x,t)|^2\,dx\,dt + \Theta_{T,\mathbb{C}}([\mathbf{e}]) \Big\}^{1/2} \le C\,(\Delta x)^k, \]

where $C = C(k, |u|_{k+1}, |u|_{k+2})$. In the purely hyperbolic case $a = 0$, the constant
$C$ is of order $(\Delta x)^{1/2}$. In the purely parabolic case $c = 0$, the constant $C$ is of order
$\Delta x$ for even values of $k$, for uniform grids and for the matrix $\mathbb{C}$ identically zero.


For a proof, see the appendix. The above error estimate gives a suboptimal
order of convergence, but it is sharp for the LDG methods. Indeed, Bassi et al. [5]
report an order of convergence of $k + 1$ for even values of $k$ and of $k$ for
odd values of $k$ for a steady state, purely elliptic problem for uniform grids and for
the matrix $\mathbb{C}$ identically zero. The numerical results for a purely parabolic problem that will
be displayed later lead to the same conclusions; see Table 5 in the section §2.b.

The error estimate is also sharp in that the optimal order of convergence of
$k + 1/2$ is recovered in the purely hyperbolic case, as expected. This improvement
of the order of convergence is a reflection of the semipositive definiteness of the
matrix $\mathbb{C}$, which enhances the stability properties of the LDG method. Indeed, in
the purely hyperbolic case, the quantity

\[ \int_0^T \sum_{1 \le j \le N} \big\{ [u_h(t)]\, c_{11}\, [u_h(t)] \big\}_{j+1/2}\,dt \]

is uniformly bounded. This additional control on the jumps of the variable $u_h$ is
reflected in the improvement of the order of accuracy from $k$ in the general case to
$k + 1/2$ in the purely hyperbolic case.
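The reason only the jumps of $u_h$ are controlled can be read off from the quadratic form appearing in the stability result. As a short worked step, assuming the matrix has the form given in (6.22), with entries $c_{12}$ and $-c_{12}$ off the diagonal and a zero in the lower right corner:

```latex
\[
[\mathbf{w}]^t\, \mathbb{C}\, [\mathbf{w}]
  = \begin{pmatrix} [u] & [q] \end{pmatrix}
    \begin{pmatrix} c_{11} & c_{12} \\ -c_{12} & 0 \end{pmatrix}
    \begin{pmatrix} [u] \\ [q] \end{pmatrix}
  = c_{11}\,[u]^2 + c_{12}\,[u][q] - c_{12}\,[q][u]
  = c_{11}\,[u]^2 .
\]
```

The skew-symmetric part drops out, so the jump term bounds $c_{11}[u_h]^2$ only and carries no information about the jumps of $q_h$.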

However, this can only happen in the purely hyperbolic case for the LDG methods.
Indeed, since $c_{11} = 0$ for $c = 0$, the control of the jumps of $u_h$ is not enforced
in the purely parabolic case. As indicated by the numerical experiments of Bassi et
al. [5] and those of section §2.b below, this can result in the effective degradation of
the order of convergence. To remedy this situation, the control of the jumps of $u_h$
in the purely parabolic case can be easily enforced by letting $c_{11}$ be strictly positive
if $|c| + |a| > 0$. Unfortunately, this is not enough to guarantee an improvement
of the accuracy: an additional control on the jumps of $q_h$ is required. This can be
easily achieved by allowing the matrix $\mathbb{C}$ to be symmetric and positive definite when
$a > 0$. In this case, the order of convergence of $k + 1/2$ can be easily obtained for
the general convection-diffusion case. However, this would force the matrix entry
$c_{22}$ to be nonzero and the property (ii) of local resolvability of $q_h$ in terms of $u_h$
would not be satisfied anymore. As a consequence, the high parallelizability of the
LDG method would be lost.

The above result shows how strongly the order of convergence of the LDG
methods depends on the choice of the matrix $\mathbb{C}$. In fact, the numerical results of
section §2.b on uniform grids indicate that with yet another choice of the matrix
$\mathbb{C}$, see (6.23), the LDG method converges with the optimal order of $k + 1$ in the
general case. The analysis of this phenomenon constitutes the subject of ongoing
work.


6.3  Numerical  results  in  the  one-dimensional  case 

In this section we present some numerical results for the schemes discussed in
this paper. We will only provide results for the following one-dimensional, linear
convection-diffusion equation

\[ \partial_t u + c\,\partial_x u - a\,\partial_x^2 u = 0 \quad \text{in } (0,T) \times (0, 2\pi), \]
\[ u(t = 0, x) = \sin(x) \quad \text{on } (0, 2\pi), \]

where $c$ and $a \ge 0$ are both constants; periodic boundary conditions are used. The
exact solution is $u(t,x) = e^{-at} \sin(x - ct)$. We compute the solution up to $T = 2$,
and use the LDG method with $\mathbb{C}$ defined by

\[ \mathbb{C} = \begin{pmatrix} \dfrac{|c|}{2} & -\dfrac{\sqrt{a}}{2} \\[2mm] \dfrac{\sqrt{a}}{2} & 0 \end{pmatrix}. \tag{6.23} \]

We notice that, for this choice of fluxes, the approximation to the convective term
$c\,\partial_x u$ is the standard upwinding, and that the approximation to the diffusion term
$a\,\partial_x^2 u$ is the standard three-point central difference, for the $P^0$ case. On the other
hand, if one uses a central flux corresponding to $c_{12} = -c_{21} = 0$, one gets a spread-out
five-point central difference approximation to the diffusion term $a\,\partial_x^2 u$.
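The two stencils mentioned above can be checked directly for piecewise-constant elements. The sketch below is a minimal illustration, not the authors' code: it assumes a uniform periodic grid, $c \ge 0$, and reads the flux choice (6.23) as the alternating one in which $q_h$ is taken from the left and $u_h$ from the right at each interface. The function name `ldg_p0_rhs` and its arguments are hypothetical.

```python
import numpy as np


def ldg_p0_rhs(u, dx, c, a, central=False):
    """Semidiscrete right-hand side du_j/dt of the P^0 LDG scheme (periodic, c >= 0)."""
    sa = np.sqrt(a)
    u_right = np.roll(u, -1)                      # u_{j+1}
    # Trace of u used in the q-flux at x_{j+1/2}: right value, or the average.
    u_hat = 0.5 * (u + u_right) if central else u_right
    # (6.14) with test function 1: dx * q_j = sqrt(a) * (u_hat_{j+1/2} - u_hat_{j-1/2}).
    q = sa * (u_hat - np.roll(u_hat, 1)) / dx
    # Trace of q used in the u-flux at x_{j+1/2}: left value, or the average.
    q_hat = 0.5 * (q + np.roll(q, -1)) if central else q
    flux = c * u - sa * q_hat                     # upwind convection minus diffusion
    # (6.13) with test function 1: dx * du_j/dt = -(flux_{j+1/2} - flux_{j-1/2}).
    return -(flux - np.roll(flux, 1)) / dx


rng = np.random.default_rng(0)
n = 16
dx = 2 * np.pi / n
u = rng.standard_normal(n)
c, a = 1.0, 0.5

# Upwind difference plus the standard three-point second difference.
fd3 = -c * (u - np.roll(u, 1)) / dx + a * (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx**2
# Spread-out five-point second difference produced by the central flux.
fd5 = a * (np.roll(u, -2) - 2 * u + np.roll(u, 2)) / (4 * dx**2)

same_as_three_point = np.allclose(ldg_p0_rhs(u, dx, c, a), fd3)
same_as_five_point = np.allclose(ldg_p0_rhs(u, dx, 0.0, a, central=True), fd5)
```

Note that $q_h$ is computed element by element from $u_h$ before the update of $u_h$, which is the local resolvability required by property (ii).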

The LDG methods based on $P^k$, with $k = 1, 2, 3, 4$, are tested. Elements of
equal size are used. Time discretization is by the third-order accurate TVD Runge-Kutta
method [81], with a sufficiently small time step so that the error in time is
negligible compared with the spatial errors. We list the $L^\infty$ errors and numerical orders
of accuracy, for $u_h$, as well as for its suitably scaled derivatives $(\Delta x)^m \partial_x^m u_h$ for
$1 \le m \le k$, at the center of each element. This gives the complete description of
the error for $u_h$ over the whole domain, as $u_h$ in each element is a polynomial of
degree $k$. We also list the $L^\infty$ errors and numerical orders of accuracy for $q_h$ at the
element center.
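For reference, a minimal sketch of one step of a third-order TVD Runge-Kutta scheme is given below. It assumes that the method of [81] is the usual three-stage convex-combination form; the function name `tvd_rk3_step` and the argument `rhs` (the semidiscrete operator, here any callable) are illustrative and not taken from the text.

```python
def tvd_rk3_step(rhs, u, dt):
    """Advance u by one step dt of du/dt = rhs(u) with three Euler-type stages."""
    u1 = u + dt * rhs(u)
    u2 = 0.75 * u + 0.25 * (u1 + dt * rhs(u1))
    return u / 3.0 + 2.0 / 3.0 * (u2 + dt * rhs(u2))


# Linear test problem du/dt = -u: one step reproduces the cubic Taylor polynomial of exp(-dt).
dt = 0.1
u_new = tvd_rk3_step(lambda v: -v, 1.0, dt)
taylor3 = 1.0 - dt + dt**2 / 2.0 - dt**3 / 6.0
```

Each stage is a convex combination of forward Euler steps, so `u` may equally be a NumPy array of element coefficients with `rhs` returning an array of the same shape.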

In all the convection-diffusion runs with $a > 0$, accuracy of at least $(k+1)$-th
order is obtained, for both $u_h$ and $q_h$, when $P^k$ elements are used. See Tables 1
to 3. The $P^4$ case for the purely convective equation $a = 0$ seems not to be in
the asymptotic regime yet with $N = 40$ elements (further refinement with $N = 80$
suffers from round-off effects due to our choice of non-orthogonal basis functions);
see Table 4. However, the absolute values of the errors are comparable with the
convection dominated case in Table 3.
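The numerical orders listed in the tables follow from pairs of errors on successive grids. The sketch below shows the standard formula, assuming the number of elements doubles between runs; the function name `observed_order` is illustrative, and the two sample pairs are the $k = 1$ entries of Table 1 for $u$ and $q$ at $N = 10$ and $N = 20$.

```python
import math


def observed_order(err_coarse, err_fine, refinement=2.0):
    """Numerical order p such that err_coarse / err_fine = refinement**p."""
    return math.log(err_coarse / err_fine) / math.log(refinement)


order_u = observed_order(4.55e-4, 5.79e-5)   # u, k = 1, N = 10 -> 20
order_q = observed_order(4.17e-5, 2.48e-6)   # q, k = 1, N = 10 -> 20
```

Rounded to two decimals, these reproduce the tabulated orders 2.97 and 4.07.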

Finally, to show that the order of accuracy could really degenerate to $k$ for $P^k$,
as was already observed in [5], we rerun the heat equation case $a = 1$, $c = 0$ with
the central flux

\[ \mathbb{C} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}. \]

This time we can see that the global order of accuracy in $L^\infty$ is only $k$ when
$P^k$ is used with an odd value of $k$.

Table 1

The heat equation $a = 1$, $c = 0$. $L^\infty$ errors and numerical order of accuracy,
measured at the center of each element, for $(\Delta x)^m \partial_x^m u_h$ for $0 \le m \le k$, and for $q_h$.

k   variable          N = 10    N = 20            N = 40
                      error     error     order   error     order
1   u                 4.55E-4   5.79E-5   2.97    7.27E-6
    Δx ∂x u           9.01E-3   2.22E-3   2.02    5.56E-4
    q                 4.17E-5   2.48E-6   4.07    1.53E-7   4.02
2   u                 1.43E-4   1.76E-5   3.02    2.19E-6   3.01
    Δx ∂x u           7.87E-4   1.03E-4   2.93    1.31E-5   2.98
    (Δx)^2 ∂x^2 u     1.64E-3   2.09E-4   2.98    2.62E-5   2.99
    q                 1.42E-4   1.76E-5   3.01    2.19E-6   3.01
3   u                 1.54E-5   9.66E-7   4.00    6.11E-8   3.98
    Δx ∂x u           3.77E-5   2.36E-6   3.99    1.47E-7   4.00
    (Δx)^2 ∂x^2 u     1.90E-4   1.17E-5   4.02    7.34E-7   3.99
    (Δx)^3 ∂x^3 u     2.51E-4   1.56E-5   4.00    9.80E-7   4.00
    q                 1.48E-5   9.66E-7   3.93    6.11E-8   3.98
4   u                 2.02E-7   5.51E-9   5.20    1.63E-10  5.07
    Δx ∂x u           1.65E-6   5.14E-8   5.00    1.61E-9   5.00
    (Δx)^2 ∂x^2 u     6.34E-6   2.04E-7   4.96    6.40E-9   4.99
    (Δx)^3 ∂x^3 u     2.92E-5   9.47E-7   4.95    2.99E-8   4.99
    (Δx)^4 ∂x^4 u     3.03E-5   9.55E-7   4.98    2.99E-8   5.00
    q                 2.10E-7   5.51E-9   5.25    1.63E-10  5.07

Table 2

The convection-diffusion equation $a = 1$, $c = 1$. $L^\infty$ errors and numerical order of
accuracy, measured at the center of each element, for $(\Delta x)^m \partial_x^m u_h$ for $0 \le m \le k$,
and for $q_h$.

k   variable          N = 10    N = 20            N = 40
                      error     error     order   error     order
1   u                 6.47E-4   1.25E-4   2.37    1.59E-5   2.97
    Δx ∂x u           9.61E-3   2.24E-3   2.10    5.56E-4   2.01
    q                 2.96E-3   1.20E-4   4.63    1.47E-5   3.02
2   u                 1.42E-4   1.76E-5   3.02    2.18E-6   3.01
    Δx ∂x u           7.93E-4   1.04E-4   2.93    1.31E-5   2.99
    (Δx)^2 ∂x^2 u     1.61E-3   2.09E-4   2.94    2.62E-5   3.00
    q                 1.26E-4   1.63E-5   2.94    2.12E-6   2.95
3   u                 1.53E-5   9.75E-7   3.98    6.12E-8   3.99
    Δx ∂x u           3.84E-5   2.34E-6   4.04    1.47E-7   3.99
    (Δx)^2 ∂x^2 u     1.89E-4   1.18E-5   4.00    7.36E-7   4.00
    (Δx)^3 ∂x^3 u     2.52E-4   1.56E-5   4.01    9.81E-7   3.99
    q                 1.57E-5   9.93E-7   3.98    6.17E-8   4.01
4   u                 2.04E-7   5.50E-9   5.22    1.64E-10  5.07
    Δx ∂x u           1.68E-6   5.19E-8   5.01    1.61E-9   5.01
    (Δx)^2 ∂x^2 u     6.36E-6   2.05E-7   4.96    6.42E-8   5.00
    (Δx)^3 ∂x^3 u     2.99E-5   9.57E-7   4.97    2.99E-8   5.00
    (Δx)^4 ∂x^4 u     2.94E-5   9.55E-7   4.95    3.00E-8   4.99
    q                 1.96E-7   5.35E-9   5.19    1.61E-10  5.06

Table 3

The convection dominated convection-diffusion equation $a = 0.01$, $c = 1$. $L^\infty$ errors
and numerical order of accuracy, measured at the center of each element,
for $(\Delta x)^m \partial_x^m u_h$ for $0 \le m \le k$, and for $q_h$.

k   variable          N = 10    N = 20            N = 40
                      error     error     order   error
1   u                 7.14E-3   9.30E-4   2.94    1.17E-4
    Δx ∂x u           6.04E-2   1.58E-2   1.93    4.02E-3
    q                 8.68E-4   1.09E-4   3.00    1.31E-5
2   u                 9.59E-4   1.25E-4   2.94    1.58E-5
    Δx ∂x u           5.88E-3   7.55E-4   2.96    9.47E-5
    (Δx)^2 ∂x^2 u     1.20E-2   1.50E-3   3.00    1.90E-4
    q                 8.99E-5   1.11E-5   3.01    1.10E-6
3   u                 1.11E-4   7.07E-6   3.97    4.43E-7
    Δx ∂x u           2.52E-4   1.71E-5   3.88    1.07E-6
    (Δx)^2 ∂x^2 u     1.37E-3   8.54E-5   4.00    5.33E-6
    (Δx)^3 ∂x^3 u     1.75E-3   1.13E-4   3.95    7.11E-6
    q                 1.18E-5   7.28E-7   4.02    4.75E-8
4   u                 1.85E-6   4.02E-8   5.53    1.19E-9
    Δx ∂x u           1.29E-5   3.76E-7   5.10    1.16E-8
    (Δx)^2 ∂x^2 u     5.19E-5   1.48E-6   5.13    4.65E-8
    (Δx)^3 ∂x^3 u     2.21E-4   6.93E-6   4.99    2.17E-7
    (Δx)^4 ∂x^4 u     2.25E-4   6.89E-6   5.03    2.17E-7
    q                 3.58E-7   3.06E-9   6.87    5.05E-11

Table 4

The convection equation $a = 0$, $c = 1$. $L^\infty$ errors and numerical order of accuracy,
measured at the center of each element, for $(\Delta x)^m \partial_x^m u_h$ for $0 \le m \le k$.

k   variable          N = 10    N = 20            N = 40
                      error     error     order   error     order
1   u                 7.24E-3   9.46E-4   2.94    1.20E-4   2.98
    Δx ∂x u           6.09E-2   1.60E-2   1.92    4.09E-3   1.97
2   u                 9.96E-4   1.28E-4           1.61E-5   2.99
    Δx ∂x u           6.00E-3   7.71E-4           9.67E-5   3.00
    (Δx)^2 ∂x^2 u     1.23E-2   1.54E-3   3.00    1.94E-4   2.99
3   u                 1.26E-4   7.50E-6   4.07    4.54E-7   4.05
    Δx ∂x u           1.63E-4   2.00E-5   3.03    1.07E-6   4.21
    (Δx)^2 ∂x^2 u     1.52E-3   9.03E-5   4.07    5.45E-6   4.05
    (Δx)^3 ∂x^3 u     1.35E-3   1.24E-4   3.45    7.19E-6   4.10
4   u                 3.55E-6   8.59E-8   5.37    3.28E-10  8.03
    Δx ∂x u           1.89E-5   1.27E-7   7.22    1.54E-8   3.05
    (Δx)^2 ∂x^2 u     8.49E-5   2.28E-6   5.22    2.33E-8   6.61
    (Δx)^3 ∂x^3 u     2.36E-4   5.77E-6   5.36    2.34E-7   4.62
    (Δx)^4 ∂x^4 u     2.80E-4   8.93E-6   4.97    1.70E-7   5.72

Table 5

The heat equation $a = 1$, $c = 0$. $L^\infty$ errors and numerical order of accuracy,
measured at the center of each element, for $(\Delta x)^m \partial_x^m u_h$ for $0 \le m \le k$,
and for $q_h$, using the central flux.

k   variable          N = 10    N = 20            N = 40
                      error     error     order   error     order
1   u                 3.59E-3   8.92E-4   2.01    2.25E-4
    Δx ∂x u           2.10E-2   1.06E-2   0.98    5.31E-3
    q                 2.39E-3   6.19E-4   1.95    1.56E-4   1.99
2   u                 6.91E-5   4.12E-6   4.07    2.57E-7   4.00
    Δx ∂x u           7.66E-4   1.03E-4   2.90    1.30E-5   2.98
    (Δx)^2 ∂x^2 u     2.98E-4   1.68E-5   4.15    1.03E-6   4.02
    q                 6.52E-5   4.11E-6   3.99    2.57E-7   4.00
3   u                 1.62E-5   1.01E-6   4.00    6.41E-8   3.98
    Δx ∂x u           1.06E-4   1.32E-5   3.01    1.64E-6   3.00
    (Δx)^2 ∂x^2 u     1.99E-4   1.22E-5   4.03    7.70E-7   3.99
    (Δx)^3 ∂x^3 u     6.81E-4   8.68E-5   2.97    1.09E-5   2.99
    q                 1.54E-5   1.01E-6   3.93    6.41E-8   3.98
4   u                 8.25E-8   1.31E-9   5.97    2.11E-11
    Δx ∂x u           1.62E-6   5.12E-8   4.98    1.60E-9
    (Δx)^2 ∂x^2 u     1.61E-6   2.41E-8   6.06    3.78E-10  6.00
    (Δx)^3 ∂x^3 u     2.90E-5   9.46E-7   4.94    2.99E-8   4.99
    (Δx)^4 ∂x^4 u     5.23E-6   7.59E-8   6.11    1.18E-9   6.01
    q                 7.85E-8   1.31E-9   5.90    2.11E-11  5.96

6.4  The LDG methods for the multidimensional case

In this section, we consider the LDG methods for the following convection-diffusion
model problem

\[ \partial_t u + \sum_{1 \le i \le d} \partial_{x_i} \Big( f_i(u) - \sum_{1 \le j \le d} a_{ij}(u)\,\partial_{x_j} u \Big) = 0 \quad \text{in } Q, \tag{6.24} \]
\[ u(t = 0) = u_0 \quad \text{on } (0,1)^d, \tag{6.25} \]

where $Q = (0,T) \times (0,1)^d$, with periodic boundary conditions. Essentially, the
one-dimensional case and the multidimensional case can be studied in exactly
the same way. However, there are two important differences that deserve
explicit discussion. The first is the treatment of the matrix of entries $a_{ij}(u)$,
which is assumed to be symmetric and semipositive definite, together with the introduction
of the variables $q_\ell$, and the second is the treatment of arbitrary meshes.

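To define the LDG method, we first notice that, since the matrix $a_{ij}(u)$ is
assumed to be symmetric and semipositive definite, there exists a symmetric
matrix $b_{ij}(u)$ such that

\[ a_{ij}(u) = \sum_{1 \le \ell \le d} b_{i\ell}(u)\, b_{\ell j}(u). \tag{6.26} \]

Then we define the new scalar variables $q_\ell = \sum_{1 \le j \le d} b_{\ell j}(u)\,\partial_{x_j} u$ and rewrite
the problem (6.24), (6.25) as follows:

\[ \partial_t u + \sum_{1 \le i \le d} \partial_{x_i} \Big( f_i(u) - \sum_{1 \le \ell \le d} b_{i\ell}(u)\, q_\ell \Big) = 0 \quad \text{in } Q, \tag{6.27} \]
\[ q_\ell - \sum_{1 \le j \le d} \partial_{x_j} g_{\ell j}(u) = 0, \quad \ell = 1, \dots, d, \quad \text{in } Q, \tag{6.28} \]
\[ u(t = 0) = u_0 \quad \text{on } (0,1)^d, \tag{6.29} \]

where $g_{\ell j}(u) = \int^u b_{\ell j}(s)\,ds$. The LDG method is now obtained by discretizing
the above equations by the Discontinuous Galerkin method.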
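One way to obtain a symmetric matrix $b$ satisfying (6.26) for a given value of $u$ is the symmetric square root built from an eigendecomposition. The sketch below is an illustration under that assumption; the text only asserts that such a $b$ exists and does not prescribe how to compute it, and the function name `symmetric_sqrt` is hypothetical.

```python
import numpy as np


def symmetric_sqrt(a):
    """Symmetric b with b @ b = a, for a symmetric positive semidefinite a."""
    a = np.asarray(a, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(a)          # a = V diag(eigvals) V^T
    eigvals = np.clip(eigvals, 0.0, None)         # remove tiny negative round-off
    return (eigvecs * np.sqrt(eigvals)) @ eigvecs.T


a_definite = np.array([[2.0, 1.0], [1.0, 2.0]])
a_singular = np.array([[1.0, 1.0], [1.0, 1.0]])   # semidefinite, rank one
b_definite = symmetric_sqrt(a_definite)
b_singular = symmetric_sqrt(a_singular)
```

With such a $b$, the auxiliary variables $q_\ell$ are linear combinations of the partial derivatives of $u$ with coefficients $b_{\ell j}(u)$, as in the definition above.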

We follow what was done in §2. So, we set $\mathbf{w} = (u, \mathbf{q})^t = (u, q_1, \dots, q_d)^t$
and, for each $i = 1, \dots, d$, introduce the flux

\[ \mathbf{h}_i(\mathbf{w}) = \Big( f_i(u) - \sum_{1 \le \ell \le d} b_{i\ell}(u)\, q_\ell,\; -g_{1i}(u), \dots, -g_{di}(u) \Big)^t. \tag{6.30} \]

We consider triangulations of $(0,1)^d$, $\mathcal{T}_{\Delta x} = \{K\}$, made of non-overlapping
polyhedra. We require that for any two elements $K$ and $K'$, $\bar{K} \cap \bar{K}'$ is either
a face $e$ of both $K$ and $K'$ with nonzero $(d-1)$-Lebesgue measure $|e|$, or has
Hausdorff dimension less than $d - 1$. We denote by $\mathcal{E}_{\Delta x}$ the set of all faces $e$
of the border of $K$ for all $K \in \mathcal{T}_{\Delta x}$. The diameter of $K$ is denoted by $\Delta x_K$
and the maximum $\Delta x_K$, for $K \in \mathcal{T}_{\Delta x}$, is denoted by $\Delta x$. We require, for the
sake of simplicity, that the triangulations $\mathcal{T}_{\Delta x}$ be regular, that is, there is a
constant $\sigma$ independent of $\Delta x$ such that

\[ \frac{\Delta x_K}{\rho_K} \le \sigma \quad \forall K \in \mathcal{T}_{\Delta x}, \]

where $\rho_K$ denotes the diameter of the maximum ball included in $K$.

We seek an approximation $\mathbf{w}_h = (u_h, \mathbf{q}_h)^t = (u_h, q_{h1}, \dots, q_{hd})^t$ to $\mathbf{w}$ such
that for each time $t \in [0,T]$, each of the components of $\mathbf{w}_h$ belongs to the finite
element space

\[ V_h = V_h^k = \{ v \in L^1((0,1)^d) : v|_K \in P^k(K) \;\; \forall K \in \mathcal{T}_{\Delta x} \}, \tag{6.31} \]

where $P^k(K)$ denotes the space of polynomials of total degree at most $k$. In
order to determine the approximate solution $\mathbf{w}_h$, we proceed exactly as in
the one-dimensional case. This time, however, the integrals are taken over each
element $K$ of the triangulation $\mathcal{T}_{\Delta x}$. We obtain the following weak formulation
on each element $K$ of the triangulation $\mathcal{T}_{\Delta x}$:

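For all $v_{h,u} \in P^k(K)$:

\[
\begin{aligned}
&\int_K \partial_t u_h(x,t)\,v_{h,u}(x)\,dx - \sum_{1 \le i \le d} \int_K h_{iu}(\mathbf{w}_h(x,t))\,\partial_{x_i} v_{h,u}(x)\,dx \\
&\quad + \int_{\partial K} \hat{h}_u(\mathbf{w}_h, \mathbf{n}_{\partial K})(x,t)\,v_{h,u}(x)\,d\Gamma(x) = 0,
\end{aligned} \tag{6.32}
\]

for $\ell = 1, \dots, d$ and for all $v_{h,q_\ell} \in P^k(K)$:

\[
\begin{aligned}
&\int_K q_{h\ell}(x,t)\,v_{h,q_\ell}(x)\,dx - \sum_{1 \le j \le d} \int_K h_{j q_\ell}(\mathbf{w}_h(x,t))\,\partial_{x_j} v_{h,q_\ell}(x)\,dx \\
&\quad + \int_{\partial K} \hat{h}_{q_\ell}(\mathbf{w}_h, \mathbf{n}_{\partial K})(x,t)\,v_{h,q_\ell}(x)\,d\Gamma(x) = 0,
\end{aligned} \tag{6.33}
\]

for all $v_{h,i} \in P^k(K)$:

\[ \int_K u_h(x,0)\,v_{h,i}(x)\,dx = \int_K u_0(x)\,v_{h,i}(x)\,dx, \tag{6.34} \]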

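where $\mathbf{n}_{\partial K}$ denotes the outward unit normal to the element $K$ at $x \in \partial K$. It
remains to choose the numerical flux $(\hat{h}_u, \hat{h}_{q_1}, \dots, \hat{h}_{q_d})^t = \hat{\mathbf{h}} = \hat{\mathbf{h}}(\mathbf{w}_h, \mathbf{n}_{\partial K})(x,t)$.

As in the one-dimensional case, we require that the fluxes $\hat{\mathbf{h}}$ be of the
form

\[ \hat{\mathbf{h}}(\mathbf{w}_h, \mathbf{n}_{\partial K})(x) = \hat{\mathbf{h}}\big( \mathbf{w}_h(x^{int_K}, t),\, \mathbf{w}_h(x^{ext_K}, t);\, \mathbf{n}_{\partial K} \big), \]

where $\mathbf{w}_h(x^{int_K})$ is the limit at $x$ taken from the interior of $K$ and $\mathbf{w}_h(x^{ext_K})$
the limit at $x$ from the exterior of $K$, and consider fluxes that:

(i) Are locally Lipschitz, conservative, that is,

\[ \hat{\mathbf{h}}\big( \mathbf{w}_h(x^{int_K}), \mathbf{w}_h(x^{ext_K}); \mathbf{n}_{\partial K} \big) + \hat{\mathbf{h}}\big( \mathbf{w}_h(x^{ext_K}), \mathbf{w}_h(x^{int_K}); -\mathbf{n}_{\partial K} \big) = 0, \]

and consistent with the flux

\[ \sum_{1 \le i \le d} \mathbf{h}_i\, n_{\partial K, i}, \]

(ii) Allow for a local resolution of each component of $\mathbf{q}_h$ in terms of $u_h$ only,

(iii) Reduce to an E-flux when $a(\cdot) \equiv 0$,

(iv) Enforce the $L^2$-stability of the method.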

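Again, we write our numerical flux as the sum of a convective flux and a
diffusive flux:

\[ \hat{\mathbf{h}} = \hat{\mathbf{h}}_{conv} + \hat{\mathbf{h}}_{diff}, \]

where the convective flux is given by

\[ \hat{\mathbf{h}}_{conv}(\mathbf{w}^-, \mathbf{w}^+; \mathbf{n}) = \big( \hat{f}(u^-, u^+; \mathbf{n}),\, \mathbf{0} \big)^t, \]

where $\hat{f}(u^-, u^+; \mathbf{n})$ is any locally Lipschitz E-flux which is conservative and
consistent with the nonlinearity

\[ \sum_{1 \le i \le d} f_i(u)\, n_i, \]

and the diffusive flux $\hat{\mathbf{h}}_{diff}(\mathbf{w}^-, \mathbf{w}^+; \mathbf{n})$ is given by

\[ \Big( -\sum_{1 \le i, \ell \le d} \frac{[g_{i\ell}(u)]}{[u]}\, \bar{q}_\ell\, n_i,\; -\sum_{1 \le i \le d} \overline{g_{i1}(u)}\, n_i, \dots, -\sum_{1 \le i \le d} \overline{g_{id}(u)}\, n_i \Big)^t - \mathbb{C}_{diff}\,[\mathbf{w}], \]

where

\[ \mathbb{C}_{diff} = \begin{pmatrix} 0 & c_{12} & c_{13} & \cdots & c_{1d} \\ -c_{12} & 0 & 0 & \cdots & 0 \\ -c_{13} & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -c_{1d} & 0 & 0 & \cdots & 0 \end{pmatrix}, \]

$c_{1j} = c_{1j}(\mathbf{w}^-, \mathbf{w}^+)$ is locally Lipschitz for $j = 1, \dots, d$,
$c_{1j} \equiv 0$ when $a(\cdot) \equiv 0$ for $j = 1, \dots, d$.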

We  claim  that  this  flux  satisfies  the  properties  (i)  to  (iv). 

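To prove that properties (i) to (iii) are satisfied is now a simple exercise.
To see that the property (iv) is satisfied, we first rewrite the flux $\hat{\mathbf{h}}$ in the
following way:

\[ \Big( \sum_{1 \le i \le d} \frac{[\phi_i(u)]}{[u]}\, n_i - \sum_{1 \le i, \ell \le d} \frac{[g_{i\ell}(u)]}{[u]}\, \bar{q}_\ell\, n_i,\; -\sum_{1 \le i \le d} \overline{g_{i1}(u)}\, n_i, \dots, -\sum_{1 \le i \le d} \overline{g_{id}(u)}\, n_i \Big)^t - \mathbb{C}\,[\mathbf{w}], \]

where

\[ \mathbb{C} = \begin{pmatrix} c_{11} & c_{12} & c_{13} & \cdots & c_{1d} \\ -c_{12} & 0 & 0 & \cdots & 0 \\ -c_{13} & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -c_{1d} & 0 & 0 & \cdots & 0 \end{pmatrix}, \]

\[ c_{11} = \frac{1}{[u]} \Big( \sum_{1 \le i \le d} \frac{[\phi_i(u)]}{[u]}\, n_i - \hat{f}(u^-, u^+; \mathbf{n}) \Big), \]

where $\phi_i(u) = \int^u f_i(s)\,ds$. Since $\hat{f}(\cdot, \cdot; \mathbf{n})$ is an E-flux,

\[ c_{11} = \frac{1}{[u]^2} \int_{u^-}^{u^+} \Big( \sum_{1 \le i \le d} f_i(s)\, n_i - \hat{f}(u^-, u^+; \mathbf{n}) \Big)\,ds \ge 0, \]

and so the matrix $\mathbb{C}$ is semipositive definite. The property (iv) follows from
this fact and from the following result.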


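Proposition 28. (Stability) We have,

\[ \frac{1}{2} \int_{(0,1)^d} u_h^2(x,T)\,dx + \int_0^T\!\!\int_{(0,1)^d} |\mathbf{q}_h(x,t)|^2\,dx\,dt + \Theta_{T,\mathbb{C}}([\mathbf{w}_h]) \le \frac{1}{2} \int_{(0,1)^d} u_0^2(x)\,dx, \]

where the quantity $\Theta_{T,\mathbb{C}}([\mathbf{w}_h])$ is given by

\[ \int_0^T \sum_{e \in \mathcal{E}_{\Delta x}} \int_e [\mathbf{w}_h(x,t)]^t\, \mathbb{C}\, [\mathbf{w}_h(x,t)]\,d\Gamma(x)\,dt. \]


We can also prove the following error estimate. We denote the integral over
$(0,1)^d$ of the sum of the squares of all the derivatives of order $(k+1)$ of $u$ by
$|u|_{k+1}^2$.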

Theorem 29. (Error estimate) Let $\mathbf{e}$ be the approximation error $\mathbf{w} - \mathbf{w}_h$.
Then we have, for arbitrary, regular grids,

\[ \Big\{ \int_{(0,1)^d} |e_u(x,T)|^2\,dx + \int_0^T\!\!\int_{(0,1)^d} |\mathbf{e}_q(x,t)|^2\,dx\,dt + \Theta_{T,\mathbb{C}}([\mathbf{e}]) \Big\}^{1/2} \le C\,(\Delta x)^k, \]

where $C = C(k, |u|_{k+1}, |u|_{k+2})$. In the purely hyperbolic case $a_{ij} = 0$, the
constant $C$ is of order $(\Delta x)^{1/2}$. In the purely parabolic case $c = 0$, the constant
$C$ is of order $\Delta x$ for even values of $k$ and of order 1 otherwise, for Cartesian
products of uniform grids and for the matrix $\mathbb{C}$ identically zero, provided that the local
spaces $Q^k$ are used instead of the spaces $P^k$, where $Q^k$ is the space of tensor
products of one-dimensional polynomials of degree $k$.
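The difference between the two local spaces can be made concrete by counting degrees of freedom per element and per unknown. The sketch below uses the standard dimension formulas for polynomials of total degree at most $k$ and for tensor products of one-dimensional polynomials of degree at most $k$; the function names `dim_P` and `dim_Q` are illustrative.

```python
from math import comb


def dim_P(k, d):
    """Dimension of P^k in d variables: polynomials of total degree at most k."""
    return comb(k + d, d)


def dim_Q(k, d):
    """Dimension of Q^k in d variables: degree at most k in each variable."""
    return (k + 1) ** d


# (k, dim P^k, dim Q^k) for d = 2 and d = 3, k = 1..4
counts_2d = [(k, dim_P(k, 2), dim_Q(k, 2)) for k in range(1, 5)]
counts_3d = [(k, dim_P(k, 3), dim_Q(k, 3)) for k in range(1, 5)]
```

For example, with $k = 3$ in three space dimensions $P^k$ has 20 degrees of freedom per element and $Q^k$ has 64, and each of the unknowns $u_h, q_{h1}, \dots, q_{hd}$ carries that many.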


6.5  Extension  to  multidimensional  systems 

In this chapter, we have considered the so-called LDG methods for convection-diffusion
problems. For scalar problems in multidimensions, we have shown
that they are $L^2$-stable and that in the linear case, they are of order $k$ if
polynomials of degree $k$ are used. We have also shown that this estimate is
sharp and have displayed the strong dependence of the order of convergence
of the LDG methods on the choice of the numerical fluxes.

The main advantage of these methods is their extremely high parallelizability
and their high-order accuracy, which render them suitable for computations
of convection-dominated flows. Indeed, although the LDG methods have
a large number of degrees of freedom per element, and hence more computations
per element are necessary, their extremely local domain of dependency
allows a very efficient parallelization that by far compensates for the extra
amount of local computations.

The LDG methods for multidimensional systems, such as the
compressible Navier-Stokes equations and the equations of the hydrodynamic
model for semiconductor device simulation, can be easily defined by simply
applying the procedure described for the multidimensional scalar case to each
component of $\mathbf{u}$. In practice, especially for viscous terms which are not symmetric
but still semipositive definite, such as for the compressible Navier-Stokes
equations, we can use $\mathbf{q} = (\partial_{x_1} \mathbf{u}, \dots, \partial_{x_d} \mathbf{u})$ as the auxiliary variables.
Although with this choice the $L^2$-stability result will not be available theoretically,
this would not cause any problem in practical implementations.


6.6  Some  numerical  results 

Next,  we  present  some  numerical  results  from  the  papers  by  Bassi  and  Rebay 
[3]  and  Lomtev  and  Karniadakis  [63]. 

•  Smooth,  steady  state  solutions.  We  start  by  displaying  the  conver¬ 
gence  of  the  method  for  a  p-refinement  done  by  Lomtev  and  Karniadakis  [63]. 
In  Figure  6.1,  we  can  see  how  the  maximum  errors  in  density,  momentum, 
and  energy  decrease  exponentially  to  zero  as  the  degree  k  of  the  approximat¬ 
ing  polynomials  increases  while  the  grid  is  kept  fixed;  details  about  the  exact 
solution  can  be  found  in  [63]. 
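The exponential decay under p-refinement can be illustrated on a much simpler problem than the one in [63]. The sketch below is a toy example under stated assumptions, not a reproduction of that computation: it takes a single fixed element $[-1, 1]$, the smooth function $\sin x$, and measures the maximum error of its $L^2$ projection onto polynomials of degree $k$. The function name `projection_error` and the quadrature and sampling sizes are illustrative choices.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def projection_error(f, k, nquad=64):
    """Max error on [-1, 1] of the L2 projection of f onto polynomials of degree k."""
    x, w = leg.leggauss(nquad)                    # Gauss-Legendre nodes and weights
    vander = leg.legvander(x, k)                  # vander[i, n] = P_n(x_i)
    n = np.arange(k + 1)
    # Legendre coefficients: (2n + 1)/2 * integral of f * P_n over [-1, 1].
    coef = (2 * n + 1) / 2.0 * (vander.T @ (w * f(x)))
    xs = np.linspace(-1.0, 1.0, 401)
    return np.max(np.abs(f(xs) - leg.legval(xs, coef)))


degrees = [1, 3, 5, 7, 9]
errors = [projection_error(np.sin, k) for k in degrees]
```

On a semilogarithmic plot of `errors` against `degrees` the points fall roughly on a straight line, which is the behavior described for Figure 6.1 with the grid kept fixed.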

Now, let us consider the laminar, transonic flow around the NACA0012
airfoil at an angle of attack of ten degrees, free stream Mach number $M = 0.8$,
and Reynolds number (based on the free stream velocity and the airfoil
chord) equal to 73; the wall temperature is set equal to the free stream
total temperature. Bassi and Rebay [3] have computed the solution of this
problem with polynomials of degree 1, 2, and 3, and Lomtev and Karniadakis
[63] have tried the same test problem with polynomials of degree 2, 4, and 6
on a mesh of 592 elements, which is about four times fewer elements than the
mesh used by Bassi and Rebay [3]. In Figure 6.3, taken from [63], we display
the pressure and drag coefficient distributions computed by Bassi and Rebay
[3] with polynomials of degree 3 and the ones computed by Lomtev and
Karniadakis [63] with polynomials of degree 6. We can see good
agreement between the two computations. In Figure 6.2, taken from [63], we see the
mesh and the Mach isolines obtained with polynomials of degree two and
four; note the improvement of the solution.

Next, we show a result from the paper by Bassi and Rebay [3]. We consider
the laminar, subsonic flow around the NACA0012 airfoil at an angle
of attack of zero degrees, free stream Mach number $M = 0.5$, and Reynolds
number equal to 5000. In Figure 6.4, we can see the Mach isolines corresponding
to linear, quadratic, and cubic elements. In Figures 6.5, 6.6, and 6.7,
details of the results with cubic elements are shown. Note how the boundary
layer is captured within a few layers of elements and how its separation at
the trailing edge of the airfoil has been clearly resolved. Bassi and Rebay [3]
report that these results are comparable to those of common structured and
unstructured finite volume methods on much finer grids, a result consistent with the
computational results we have displayed in these notes.

Finally, we present a not-yet-published result kindly provided by Lomtev
and Karniadakis about the simulation of an expansion pipe flow. The smaller
cylinder has a diameter of 1 and the larger cylinder has a diameter of 2. In
Figure 6.8, we display the velocity profile and some streamlines for a Reynolds
number equal to 50 and Mach number 0.2. The computation was made with
polynomials of degree 5 and a mesh of 600 tetrahedra; of course the tetrahedra
have curved faces to accommodate the exact boundaries. In Figure 6.9,
we display a comparison between computational and experimental results. Two
quantities are plotted as functions of the Reynolds number. The first is the
distance between the step and the center of the vortex (lower branch) and the
second is the distance from the step to the separation point (upper branch).
The computational results are obtained by the method under consideration
with polynomials of degree 5 for the compressible Navier-Stokes equations,
and by a standard Galerkin formulation in terms of velocity-pressure
(NEKTAR), by Sherwin and Karniadakis [79], or in terms of velocity-vorticity
(IVVA), by Trujillo [87], for the incompressible Navier-Stokes equations;
results produced by the code called PRISM are also included, see Newmann
[69]. The experimental data were taken from Macagno and Hung [67]. The
agreement between computations and experiments is remarkable.

• Unsteady solutions. To end this chapter, we present the computation
of an unsteady solution by Lomtev and Karniadakis [63]. The test problem
is the classical problem of a flow around a cylinder in two space dimensions.
The Reynolds number is 10,000 and the Mach number 0.2.

In  Figure  6.10,  the  streamlines  are  shown  for  a  computation  made  on  a 
grid  of  680  triangles  (with  curved  sides  fitting  the  cylinder)  and  polynomials 
whose  degree  could  vary  from  element  to  element;  the  maximum  degree  was 
5.  In  Figure  6.11,  details  of  the  mesh  and  the  density  around  the  cylinder  are 
shown.  Note  how  the  method  is  able  to  capture  the  shear  layer  instability 
observed experimentally. For more details, see [63].

Fig. 6.1. Maximum errors of the density (triangles), momentum (circles) and
energy (squares) as a function of the degree of the approximating polynomial plus
one (called “number of modes” in the picture).




Fig. 6.2. Mesh (top) and Mach isolines around the NACA0012 airfoil, (Re =
73, M = 0.8, angle of attack of 10 degrees) for quadratic (middle) and quartic
(bottom) elements.

Fig. 6.3. Pressure (top) and drag (bottom) coefficient distributions. The squares
were obtained by Bassi and Rebay [3] with cubics and the crosses by Lomtev and
Karniadakis [63] with polynomials of degree 6.

Fig. 6.5. Pressure isolines around the NACA0012 airfoil, (Re = 5000, M = 0.5,
zero angle of attack) for cubic elements without (top) and with (bottom)
the corresponding grid.

Fig. 6.6. Mach isolines around the leading edge of the NACA0012 airfoil, (Re =
5000, M = 0.5, zero angle of attack) for cubic elements without (top) and
with (bottom) the corresponding grid.

Fig. 6.7. Mach isolines around the trailing edge of the NACA0012 airfoil, (Re =
5000, M = 0.5, zero angle of attack) for cubic elements without (top) and
with (bottom) the corresponding grid.

Fig. 6.8. Expansion pipe flow at Reynolds number 50 and Mach number 0.2.
Velocity profile and streamlines computed with a mesh of 600 elements and polynomials
of degree 5.

Fig. 6.9. Expansion pipe flow: Comparison between computational and experimental
results.


Fig. 6.11. Flow around a cylinder with Reynolds number 10,000 and Mach
number 0.2. Detail of the mesh (top) and density (bottom) around the cylinder.

6.7 Appendix: Proof of the L2-error estimates

Proof of Proposition 26. In this section, we prove the nonlinear
stability result of Proposition 26. To do that, we first show how to obtain
the corresponding stability result for the exact solution and then mimic the
argument to obtain Proposition 26.

The continuous case as a model. We start by rewriting the equations
(6.10) and (6.11) in compact form. If in equations (6.10) and (6.11) we
replace $v_u(x)$ and $v_q(x)$ by $v_u(x,t)$ and $v_q(x,t)$, respectively, add the resulting
equations, sum on $j$ from 1 to $N$, and integrate in time from 0 to $T$, we
obtain that

\[ B(w, v) = 0, \quad \forall \text{ smooth } v, \tag{6.35} \]

where

\[
\begin{aligned}
B(w,v) ={}& \int_0^T\!\!\int_0^1 \partial_t u(x,t)\, v_u(x,t)\, dx\, dt
 + \int_0^T\!\!\int_0^1 q(x,t)\, v_q(x,t)\, dx\, dt \\
& - \int_0^T\!\!\int_0^1 h(w(x,t))^t\, \partial_x v(x,t)\, dx\, dt.
\end{aligned}
\tag{6.36}
\]

Note that if we use the fact that

\[ h(w(x,t))^t\, \partial_x w(x,t) = \partial_x \big( \phi(u) - g(u)\, q \big) \]

is a complete derivative, we see that

\[
B(w,w) = \frac12 \int_0^1 u^2(x,T)\, dx + \int_0^T\!\!\int_0^1 q^2(x,t)\, dx\, dt
 - \frac12 \int_0^1 u_0^2(x)\, dx,
\tag{6.37}
\]

and that $B(w,w) = 0$, by (6.35). As a consequence, we immediately obtain
the following $L^2$-stability result:

\[
\frac12 \int_0^1 u^2(x,T)\, dx + \int_0^T\!\!\int_0^1 q^2(x,t)\, dx\, dt
 = \frac12 \int_0^1 u_0^2(x)\, dx.
\]

This is the argument we have to mimic in order to prove Proposition 26.
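
As a reading aid, the two integrations behind (6.37) can be sketched as follows. This step is not spelled out in the notes; it takes $v = w$ in (6.36) and assumes that the boundary contributions of $\phi(u) - g(u)\,q$ at $x = 0$ and $x = 1$ cancel, as they do for instance under periodic boundary conditions.

```latex
\begin{aligned}
\int_0^T\!\!\int_0^1 \partial_t u\, u \, dx\, dt
  &= \frac12 \int_0^1 u^2(x,T)\, dx - \frac12 \int_0^1 u_0^2(x)\, dx, \\
\int_0^T\!\!\int_0^1 h(w)^t\, \partial_x w \, dx\, dt
  &= \int_0^T \Big[ \phi(u) - g(u)\, q \Big]_{x=0}^{x=1} dt = 0 .
\end{aligned}
```

The remaining term of (6.36) with $v_q = q$ is simply the double integral of $q^2$, which gives (6.37).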

The discrete case. Thus, we start by finding a compact form of equations
(6.13) and (6.14). If we replace $v_{h,u}(x)$ and $v_{h,q}(x)$ by $v_{h,u}(x,t)$ and $v_{h,q}(x,t)$
in the equations (6.13) and (6.14), add them up, sum on $j$ from 1 to $N$, and
integrate in time from 0 to $T$, we obtain

\[
B_h(w_h, v_h) = 0, \quad \forall\, v_h(t) \in V_h^k \times V_h^k, \quad \forall\, t \in (0,T),
\tag{6.38}
\]

where

\[
\begin{aligned}
B_h(w_h,v_h) ={}& \int_0^T\!\!\int_0^1 \partial_t u_h(x,t)\, v_{h,u}(x,t)\, dx\, dt
 + \int_0^T\!\!\int_0^1 q_h(x,t)\, v_{h,q}(x,t)\, dx\, dt \\
& - \int_0^T \sum_{1\le j\le N} \hat h(w_h)^t_{j+1/2}(t)\, [\, v_h(t)\, ]_{j+1/2}\, dt \\
& - \int_0^T \sum_{1\le j\le N} \int_{I_j} h(w_h(x,t))^t\, \partial_x v_h(x,t)\, dx\, dt.
\end{aligned}
\tag{6.39}
\]

Next, we obtain an expression for $B_h(w_h, w_h)$. It is contained in the following
result.

Lemma 30. We have

\[
B_h(w_h,w_h) = \frac12 \int_0^1 u_h^2(x,T)\, dx
 + \int_0^T\!\!\int_0^1 q_h^2(x,t)\, dx\, dt + \Theta_{T,C}([w_h])
 - \frac12 \int_0^1 u_h^2(x,0)\, dx,
\]

where $\Theta_{T,C}([w_h])$ is defined in Proposition 26.

Next, since $B_h(w_h, w_h) = 0$, by (6.38), we get the equality

\[
\frac12 \int_0^1 u_h^2(x,T)\, dx + \int_0^T\!\!\int_0^1 q_h^2(x,t)\, dx\, dt + \Theta_{T,C}([w_h])
 = \frac12 \int_0^1 u_h^2(x,0)\, dx,
\]

from which Proposition 26 easily follows since

\[ \frac12 \int_0^1 u_h^2(x,0)\, dx \le \frac12 \int_0^1 u_0^2(x)\, dx, \]

by (6.12). It remains to prove Lemma 30.

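Proof of Lemma 30. After setting $v_h = w_h$ in (6.39), we get

\[
\begin{aligned}
B_h(w_h,w_h) ={}& \frac12 \int_0^1 u_h^2(x,T)\, dx + \int_0^T\!\!\int_0^1 q_h^2(x,t)\, dx\, dt \\
& + \int_0^T \Theta_{diss}(t)\, dt - \frac12 \int_0^1 u_h^2(x,0)\, dx,
\end{aligned}
\]

where $\Theta_{diss}(t)$ is given by

\[
\Theta_{diss}(t) = - \sum_{1\le j\le N} \Big\{ \hat h(w_h)^t_{j+1/2}(t)\, [\, w_h(t)\, ]_{j+1/2}
 + \int_{I_j} h(w_h(x,t))^t\, \partial_x w_h(x,t)\, dx \Big\}.
\]

It only remains to show that

\[ \int_0^T \Theta_{diss}(t)\, dt = \Theta_{T,C}([w_h]). \]

To do that, we proceed as follows. Since

\[
\begin{aligned}
h(w_h(x,t))^t\, \partial_x w_h(x,t)
 &= \big( f(u_h) - \sqrt{a(u_h)}\, q_h \big)\, \partial_x u_h - g(u_h)\, \partial_x q_h \\
 &= \partial_x \Big( \int^{u_h} f(s)\, ds - g(u_h)\, q_h \Big) \\
 &= \partial_x \big( \phi(u_h) - g(u_h)\, q_h \big) \\
 &= \partial_x H(w_h(x,t)),
\end{aligned}
\]

we get

\[
\begin{aligned}
\Theta_{diss}(t)
 &= \sum_{1\le j\le N} \Big\{ [\, H(w_h(t))\, ]_{j+1/2} - \hat h(w_h)^t_{j+1/2}(t)\, [\, w_h(t)\, ]_{j+1/2} \Big\} \\
 &= \sum_{1\le j\le N} \Big\{ [\, H(w_h(t))\, ] - \hat h(w_h)^t(t)\, [\, w_h(t)\, ] \Big\}_{j+1/2}.
\end{aligned}
\]

Since, by the definition of $H$,

\[
\begin{aligned}
[\, H(w_h(t))\, ] &= [\, \phi(u_h(t))\, ] - [\, g(u_h(t))\, q_h(t)\, ] \\
 &= [\, \phi(u_h(t))\, ] - [\, g(u_h(t))\, ]\, \overline{q_h}(t) - [\, q_h(t)\, ]\, \overline{g(u_h(t))},
\end{aligned}
\]

and since $(\hat h_u, \hat h_q)^t = \hat h$, we get

\[
\begin{aligned}
\Theta_{diss}(t)
 ={}& \sum_{1\le j\le N} \Big\{ [\, \phi(u_h(t))\, ] - [\, g(u_h(t))\, ]\, \overline{q_h}(t) - [\, u_h(t)\, ]\, \hat h_u \Big\}_{j+1/2} \\
 & + \sum_{1\le j\le N} \Big\{ - [\, q_h(t)\, ]\, \overline{g(u_h(t))} - [\, q_h(t)\, ]\, \hat h_q \Big\}_{j+1/2}.
\end{aligned}
\]

This is the crucial step to obtain the $L^2$-stability of the LDG methods, since
the above expression gives us key information about the form that the flux
$\hat h$ should have in order to make $\Theta_{diss}(t)$ a nonnegative quantity and hence
enforce the $L^2$-stability of the LDG methods. Thus, by taking $\hat h$ as in (6.16),
we get

\[
\Theta_{diss}(t) = \sum_{1\le j\le N} \Big\{ [\, w_h(t)\, ]^t\, C\, [\, w_h(t)\, ] \Big\}_{j+1/2},
\]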
and the result follows. This completes the proof of Lemma 30 and, with it,
the proof of Proposition 26.

Proof of Theorem 27. In this section, we prove the error estimate of
Theorem 27, which holds for the linear case $f'(\cdot) \equiv c$ and $a(\cdot) \equiv a$. To do that,
we first show how to estimate the error between the solutions $w_\nu = (u_\nu, q_\nu)^t$,
$\nu = 1, 2$, of

\[
\begin{aligned}
\partial_t u_\nu + \partial_x \big( f(u_\nu) - \sqrt{a(u_\nu)}\, q_\nu \big) &= 0 \quad \text{in } (0,T) \times (0,1), \\
q_\nu - \partial_x g(u_\nu) &= 0 \quad \text{in } (0,T) \times (0,1), \\
u_\nu(t = 0) &= u_{0,\nu} \quad \text{on } (0,1).
\end{aligned}
\]

Then, we mimic the argument in order to prove Theorem 27.

The continuous case as a model. By the definition of the form $B(\cdot,\cdot)$,
(6.36), we have, for $\nu = 1, 2$,

\[ B(w_\nu, v) = 0, \quad \forall \text{ smooth } v(t), \quad \forall\, t \in (0,T). \]

Since in this case the form $B(\cdot,\cdot)$ is bilinear, from the above equation we
obtain the so-called error equation:

\[ B(e, v) = 0, \quad \forall \text{ smooth } v(t), \quad \forall\, t \in (0,T), \]

where $e = w_1 - w_2$. Now, from (6.37), we get that

\[
B(e,e) = \frac12 \int_0^1 e_u^2(x,T)\, dx + \int_0^T\!\!\int_0^1 e_q^2(x,t)\, dx\, dt
 - \frac12 \int_0^1 e_u^2(x,0)\, dx,
\]

and since $e_u(x,0) = u_{0,1}(x) - u_{0,2}(x)$ and $B(e,e) = 0$, by the error equation,
we immediately obtain the error estimate we sought:

\[
\frac12 \int_0^1 e_u^2(x,T)\, dx + \int_0^T\!\!\int_0^1 e_q^2(x,t)\, dx\, dt
 = \frac12 \int_0^1 \big( u_{0,1}(x) - u_{0,2}(x) \big)^2 dx.
\]

To prove Theorem 27, we only need to obtain a discrete version of this
argument.

The discrete case. Since

\[
\begin{aligned}
B_h(w_h, v_h) &= 0, \quad \forall\, v_h(t) \in V_h \times V_h, \quad \forall\, t \in (0,T), \\
B_h(w, v_h) &= 0, \quad \forall\, v_h(t) \in V_h \times V_h, \quad \forall\, t \in (0,T),
\end{aligned}
\]

by (6.38) and by equations (6.10) and (6.11), respectively, we immediately
obtain our error equation:

\[ B_h(e, v_h) = 0, \quad \forall\, v_h(t) \in V_h \times V_h, \quad \forall\, t \in (0,T), \]

where $e = w - w_h$. Now, according to the continuous case argument, we
should consider next the quantity $B_h(e, e)$; however, since $e$ is not in the finite
element space, it is more convenient to consider $B_h(P_h(e), P_h(e))$, where

\[ P_h(e(t)) = \big( P_h(e_u(t)),\, P_h(e_q(t)) \big) \]

is the so-called $L^2$-projection of $e(t)$ into the finite element space $V_h \times V_h$.
The $L^2$-projection of the function $p$ into $V_h$, $P_h(p)$, is defined as the only
element of the finite element space $V_h$ such that

\[
\forall\, v_h \in V_h: \quad \int_0^1 \big( P_h(p)(x) - p(x) \big)\, v_h(x)\, dx = 0.
\tag{6.41}
\]

Note that, in fact, $u_h(t = 0) = P_h(u_0)$, by (6.15).

Thus, by Lemma 30, we have

\[
\begin{aligned}
B_h(P_h(e), P_h(e)) ={}& \frac12 \int_0^1 | P_h(e_u(T))(x) |^2\, dx
 + \int_0^T\!\!\int_0^1 | P_h(e_q(t))(x) |^2\, dx\, dt \\
& + \Theta_{T,C}([P_h(e)]) - \frac12 \int_0^1 | P_h(e_u(0))(x) |^2\, dx,
\end{aligned}
\]

and since

\[ P_h(e_u(0)) = P_h(u_0 - u_h(0)) = P_h(u_0) - u_h(0) = 0, \]

by (6.15) and (6.41), and

\[ B_h(P_h(e), P_h(e)) = B_h(P_h(e) - e, P_h(e)) = B_h(P_h(w) - w, P_h(e)), \]

by the error equation, we get

\[
\begin{aligned}
\frac12 \int_0^1 | P_h(e_u(T))(x) |^2\, dx
 + \int_0^T\!\!\int_0^1 | P_h(e_q(t))(x) |^2\, dx\, dt + \Theta_{T,C}([P_h(e)]) & \\
 = B_h(P_h(w) - w, P_h(e)). &
\end{aligned}
\tag{6.42}
\]

Note that since in our continuous model the right-hand side is zero, we expect
the term $B_h(P_h(w) - w, P_h(e))$ to be small.

Estimating the right-hand side. To show that this is so, we must
suitably treat the term $B_h(P_h(w) - w, P_h(e))$.

Lemma 31. For $p = P_h(w) - w$, we have

\[
\begin{aligned}
B_h(p, P_h(e)) \le{}& \frac12\, \Theta_{T,C}([P_h(e)])
 + \frac12 \int_0^T\!\!\int_0^1 | P_h(e_q(t))(x) |^2\, dx\, dt \\
& + \frac12\, (\Delta x)^{2k} \int_0^T C_1(t)\, dt \\
& + (\Delta x)^{k} \int_0^T C_2(t)\, \Big\{ \int_0^1 | P_h(e_u(t))(x) |^2\, dx \Big\}^{1/2} dt,
\end{aligned}
\]

where

\[
\begin{aligned}
C_1(t) &= 2\, c_k^2\, \Big\{ \Big( \frac{(|c| + c_{11})^2}{c_{11}}\, \Delta x + 4\, |c_{12}|^2 d_k^2 \Big)\, |u(t)|^2_{k+1}
 + 4\, a\, d_k^2\, (\Delta x)^{2(\hat k - k)}\, |u(t)|^2_{\hat k+1} \Big\}, \\
C_2(t) &= \sqrt{8}\, c_k\, d_k\, \Big\{ \sqrt{a}\, |c_{12}|\, |u(t)|_{k+2}
 + a\, (\Delta x)^{\hat k - k}\, |u(t)|_{\hat k+2} \Big\},
\end{aligned}
\]

where the constants $c_k$ and $d_k$ depend solely on $k$, and $\hat k = k$ except when the
grids are uniform and $k$ is even, in which case $\hat k = k + 1$.

Note how $c_{11}$ appears in the denominator of $C_1(t)$. However, $C_1(t)$ remains
bounded as $c_{11}$ goes to zero since the convective numerical flux is an E-flux.

To prove this result, we will need the following auxiliary lemmas. We
denote by $|u|^2_{H^{(k+1)}(J)}$ the integral over $J$ of the square of the $(k+1)$-th
derivative of $u$.

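Lemma 32. For $p = P_h(w) - w$, we have

\[
\begin{aligned}
| \overline{p}_{u,\,j+1/2} | &\le c_k\, (\Delta x)^{\hat k+1/2}\, |u|_{H^{(\hat k+1)}(J_{j+1/2})}, \\
| [\, p_u\, ]_{j+1/2} | &\le c_k\, (\Delta x)^{k+1/2}\, |u|_{H^{(k+1)}(J_{j+1/2})}, \\
| \overline{p}_{q,\,j+1/2} | &\le c_k \sqrt{a}\, (\Delta x)^{\hat k+1/2}\, |u|_{H^{(\hat k+2)}(J_{j+1/2})}, \\
| [\, p_q\, ]_{j+1/2} | &\le c_k \sqrt{a}\, (\Delta x)^{k+1/2}\, |u|_{H^{(k+2)}(J_{j+1/2})},
\end{aligned}
\]

where $J_{j+1/2} = I_j \cup I_{j+1}$, the constant $c_k$ depends solely on $k$, and $\hat k = k$
except when the grids are uniform and $k$ is even, in which case $\hat k = k + 1$.

Proof. The last two inequalities follow from the first two and from the
fact that $q = \sqrt{a}\, \partial_x u$. The first two inequalities with $\hat k = k$ follow from the
definitions of $\overline{p}_u$ and $[\, p_u\, ]$ and from the following estimate:

\[
| P_h(u)(x^{\pm}_{j+1/2}) - u_{j+1/2} | \le \tfrac12\, c_k\, (\Delta x)^{k+1/2}\, |u|_{H^{(k+1)}(J_{j+1/2})},
\]

where the constant $c_k$ depends solely on $k$. This inequality follows from the
fact that

\[ P_h(u)(x^{\pm}_{j+1/2}) - u_{j+1/2} = 0 \]

when $u$ is a polynomial of degree $k$ and from a simple application of the
Bramble-Hilbert lemma.

To prove the inequalities in the case in which $\hat k = k + 1$, we only need to
show that if $u$ is a polynomial of degree $k + 1$ for $k$ even, then $\overline{p}_u = 0$. It is
clear that it is enough to show this equality for the particular choice

\[ u(x) = \big( (x - x_{j+1/2}) / (\Delta x / 2) \big)^{k+1}. \]

To prove this, we recall that if $P_\ell$ denotes the Legendre polynomial of order
$\ell$, then

(i) $\int_{-1}^{1} P_\ell(s)\, P_m(s)\, ds = \frac{2}{2\ell+1}\, \delta_{\ell m}$,

(ii) $P_\ell(\pm 1) = (\pm 1)^\ell$, and

(iii) $P_\ell(s)$ is a linear combination of odd (even) powers of $s$ for odd (even)
values of $\ell$.

Since we are assuming that the grid is uniform, $\Delta x_j = \Delta x_{j+1} = \Delta x$, we can
write, by (i), that

\[
P_h(u)(x) = \sum_{0\le \ell\le k} \frac{2\ell+1}{2}\, \Big\{ \int_{-1}^{1} P_\ell(s)\, u\big( x_j + \tfrac12 \Delta x\, s \big)\, ds \Big\}\,
 P_\ell\Big( \frac{x - x_j}{\Delta x / 2} \Big),
\]

for $x \in I_j$. Hence, for our particular choice of $u$, we have that the value of
$\overline{p}_{u,\,j+1/2}$ is given by

\[
\begin{aligned}
& \frac12 \sum_{0\le \ell\le k} \frac{2\ell+1}{2} \int_{-1}^{1} P_\ell(s)\, \big\{ (s-1)^{k+1} P_\ell(1) + (s+1)^{k+1} P_\ell(-1) \big\}\, ds \\
&\quad = \frac12 \sum_{0\le \ell\le k}\ \sum_{0\le i\le k+1} \frac{2\ell+1}{2} \binom{k+1}{i}
 \int_{-1}^{1} P_\ell(s)\, s^i\, \big\{ (-1)^{k+1-i} P_\ell(1) + P_\ell(-1) \big\}\, ds \\
&\quad = \frac12 \sum_{0\le \ell\le k}\ \sum_{0\le i\le k+1} \frac{2\ell+1}{2} \binom{k+1}{i}
 \int_{-1}^{1} P_\ell(s)\, s^i\, ds\ \big\{ (-1)^{k+1-i} + (-1)^{\ell} \big\},
\end{aligned}
\]

by (ii). When the factor $\{ (-1)^{k+1-i} + (-1)^{\ell} \}$ is different from zero,
$|k + 1 - i + \ell|$ is even and, since $k$ is also even, $|i - \ell|$ is odd. In this case, by (iii),

\[ \int_{-1}^{1} P_\ell(s)\, s^i\, ds = 0, \]

and so $\overline{p}_{u,\,j+1/2} = 0$. This completes the proof.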
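
The vanishing of the interface average for even $k$ can also be checked numerically. The following Python sketch is not part of the original notes: it assumes NumPy is available, and the helper name `interface_average` is hypothetical. It evaluates the first sum in the display above by Gauss-Legendre quadrature in the reference variable $s$, for the particular $u$ chosen in the proof on a uniform grid.

```python
import numpy as np
from numpy.polynomial import legendre as L


def interface_average(k):
    """Average at x_{j+1/2} of the L2-projection (degree k) of
    u(x) = ((x - x_{j+1/2}) / (dx/2))**(k+1) on a uniform grid.

    In the reference variable s in [-1, 1], u equals (s - 1)**(k+1) on the
    left cell I_j and (s + 1)**(k+1) on the right cell I_{j+1}; the exact
    value u(x_{j+1/2}) is zero, so this average is the quantity shown to
    vanish for even k.
    """
    # k + 2 Gauss nodes integrate polynomials of degree <= 2k + 3 exactly,
    # which covers the integrands of degree <= 2k + 1 used here.
    s, w = L.leggauss(k + 2)
    total = 0.0
    for ell in range(k + 1):
        P = L.Legendre.basis(ell)
        scale = (2 * ell + 1) / 2.0
        # Legendre coefficient on the left cell, evaluated at s = +1.
        c_left = scale * np.sum(w * P(s) * (s - 1.0) ** (k + 1))
        # Legendre coefficient on the right cell, evaluated at s = -1.
        c_right = scale * np.sum(w * P(s) * (s + 1.0) ** (k + 1))
        total += c_left * P(1.0) + c_right * P(-1.0)
    return 0.5 * total
```

For $k = 0$ and $k = 2$ the returned value is zero up to rounding, whereas for $k = 1$ it is $-2/3$, which illustrates why the extra order is only available for even $k$.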

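We will also need the following result, which follows from a simple scaling
argument.

Lemma 33. We have

\[
| [\, P_h(p)\, ]_{j+1/2} | \le d_k\, (\Delta x)^{-1/2}\, \| P_h(p) \|_{L^2(J_{j+1/2})},
\]

where $J_{j+1/2} = I_j \cup I_{j+1}$ and the constant $d_k$ depends solely on $k$.

We are now ready to prove Lemma 31.

Proof of Lemma 31. To simplify the notation, let us set $v_h = P_h(e)$. By
the definition of $B_h(\cdot,\cdot)$, we have

\[
\begin{aligned}
B_h(p, v_h) ={}& \int_0^T\!\!\int_0^1 \partial_t p_u(x,t)\, v_{h,u}(x,t)\, dx\, dt
 + \int_0^T\!\!\int_0^1 p_q(x,t)\, v_{h,q}(x,t)\, dx\, dt \\
& - \int_0^T \sum_{1\le j\le N} \hat h(p)^t_{j+1/2}(t)\, [\, v_h(t)\, ]_{j+1/2}\, dt \\
& - \int_0^T \sum_{1\le j\le N} \int_{I_j} h(p(x,t))^t\, \partial_x v_h(x,t)\, dx\, dt \\
={}& - \int_0^T \sum_{1\le j\le N} \hat h(p)^t_{j+1/2}(t)\, [\, v_h(t)\, ]_{j+1/2}\, dt,
\end{aligned}
\]

by the definition of the $L^2$-projection (6.41).

Now, recalling that $p = (p_u, p_q)^t$ and that $v_h = (v_u, v_q)^t$, we have

\[
\begin{aligned}
\hat h(p)^t\, [\, v_h(t)\, ] ={}& ( c\, \overline{p}_u - c_{11}\, [\, p_u\, ] )\, [\, v_u\, ] \\
& + ( -\sqrt{a}\, \overline{p}_q - c_{12}\, [\, p_q\, ] )\, [\, v_u\, ] \\
& + ( -\sqrt{a}\, \overline{p}_u + c_{12}\, [\, p_u\, ] )\, [\, v_q\, ] \\
={}& \theta_1 + \theta_2 + \theta_3.
\end{aligned}
\]

By Lemmas 32 and 33, and writing $J$ instead of $J_{j+1/2}$, we get

\[
\begin{aligned}
| \theta_1 | &\le c_k\, (\Delta x)^{k+1/2}\, |u|_{H^{(k+1)}(J)}\, ( |c| + c_{11} )\, | [\, v_u\, ] |, \\
| \theta_2 | &\le c_k d_k\, (\Delta x)^{k}\, \Big( a\, |u|_{H^{(\hat k+2)}(J)}\, (\Delta x)^{\hat k - k}
 + \sqrt{a}\, |c_{12}|\, |u|_{H^{(k+2)}(J)} \Big)\, \| v_u \|_{L^2(J)}, \\
| \theta_3 | &\le c_k d_k\, (\Delta x)^{k}\, \Big( \sqrt{a}\, |u|_{H^{(\hat k+1)}(J)}\, (\Delta x)^{\hat k - k}
 + |c_{12}|\, |u|_{H^{(k+1)}(J)} \Big)\, \| v_q \|_{L^2(J)}.
\end{aligned}
\]

This is the crucial step for obtaining our error estimates. Note that the
treatment of $\theta_1$ is very different from the treatment of $\theta_2$ and $\theta_3$. The reason for
this difference is that the upper bound for $\theta_1$ can be controlled by the form
$\Theta_{T,C}([v_h])$; we recall that $v_h = P_h(e)$. This is not the case for the upper
bound for $\theta_2$ because $\Theta_{T,C}([v_h]) = 0$ if $C = 0$, nor is it the case for the upper
bound for $\theta_3$ because $\Theta_{T,C}([v_h])$ does not involve the jumps $[\, v_q\, ]$!

Thus, after a suitable application of Young's inequality and simple
algebraic manipulations, we get

\[
\hat h(p)^t\, [\, v_h(t)\, ] \le \frac12\, c_{11}\, [\, v_u\, ]^2 + \frac14\, \| v_q \|^2_{L^2(J)}
 + \frac12\, C_{1,j}(t)\, (\Delta x)^{2k} + C_{2,j}(t)\, (\Delta x)^{k}\, \| v_u \|_{L^2(J)},
\]

where

\[
C_{1,j}(t) = c_k^2\, \Big( \frac{(|c| + c_{11})^2}{c_{11}}\, \Delta x + 4\, |c_{12}|^2 d_k^2 \Big)\, |u(t)|^2_{H^{(k+1)}(J)}
 + 4\, a\, c_k^2 d_k^2\, (\Delta x)^{2(\hat k - k)}\, |u(t)|^2_{H^{(\hat k+1)}(J)},
\]

and

\[
C_{2,j}(t) = c_k\, d_k\, \Big\{ \sqrt{a}\, |c_{12}|\, |u(t)|_{H^{(k+2)}(J)}
 + a\, (\Delta x)^{\hat k - k}\, |u(t)|_{H^{(\hat k+2)}(J)} \Big\}.
\]

Since

\[
B_h(p, v_h) \le \int_0^T \sum_{1\le j\le N} \big|\, \hat h(p)^t_{j+1/2}(t)\, [\, v_h(t)\, ]_{j+1/2}\, \big|\, dt,
\]

and since $J_{j+1/2} = I_j \cup I_{j+1}$, the result follows after simple applications of
the Cauchy-Schwarz inequality. This completes the proof.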
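
To make the use of Young's inequality in the proof above more concrete, here is a sketch of how the bound on $\theta_1$ can be split. The weight chosen below is one possible choice that reproduces the $c_{11}$ term of $C_{1,j}(t)$; the original text does not spell it out, so it should be read as an illustration rather than as the authors' exact computation. It uses $ab \le \tfrac{c_{11}}{2} a^2 + \tfrac{1}{2 c_{11}} b^2$ with $a = |[\, v_u\, ]|$:

```latex
\begin{aligned}
|\theta_1|
 &\le c_k\, (\Delta x)^{k+1/2}\, |u|_{H^{(k+1)}(J)}\, (|c| + c_{11})\, |[\, v_u\, ]| \\
 &\le \frac12\, c_{11}\, [\, v_u\, ]^2
   + \frac12\, c_k^2\, \frac{(|c| + c_{11})^2}{c_{11}}\, \Delta x\;
     |u|^2_{H^{(k+1)}(J)}\, (\Delta x)^{2k} .
\end{aligned}
```

The first term is the one absorbed by $\tfrac12 \Theta_{T,C}([v_h])$, and the second is the part of $\tfrac12 C_{1,j}(t)\, (\Delta x)^{2k}$ that carries $c_{11}$ in the denominator.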

Conclusion. Combining the equation (6.42) with the estimate of Lemma
31, we easily obtain, after a simple application of Gronwall's lemma,

\[
\begin{aligned}
& \Big\{ \int_0^1 | P_h(e_u(T))(x) |^2\, dx + \int_0^T\!\!\int_0^1 | P_h(e_q(t))(x) |^2\, dx\, dt
 + \Theta_{T,C}([P_h(e)]) \Big\}^{1/2} \\
&\qquad \le (\Delta x)^{k}\, \Big\{ \sqrt{ \int_0^T C_1(t)\, dt } + \int_0^T C_2(t)\, dt \Big\}.
\end{aligned}
\]

Theorem 27 follows easily from this inequality, Lemma 33, and from the
following simple approximation result:

\[
\| p - P_h(p) \|_{L^2(0,1)} \le g_k\, (\Delta x)^{k+1}\, |p|_{H^{(k+1)}(0,1)},
\]

where $g_k$ depends solely on $k$.
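
The approximation result above can be observed numerically. The sketch below is an illustration added here, not code from the notes: it assumes NumPy, a uniform mesh of (0, 1), and hypothetical helper names `l2_projection_error` and `observed_order`. It builds the $L^2$-projection cell by cell from Legendre coefficients, as in the definition (6.41), and estimates the convergence order by halving $\Delta x$.

```python
import numpy as np
from numpy.polynomial import legendre as L


def l2_projection_error(p, k, n_cells, n_quad=12):
    """L2(0,1) norm of p - P_h(p) for piecewise polynomials of degree k
    on a uniform mesh with n_cells cells (Gauss-Legendre quadrature)."""
    s, w = L.leggauss(n_quad)
    dx = 1.0 / n_cells
    err2 = 0.0
    for j in range(n_cells):
        x_mid = (j + 0.5) * dx
        vals = p(x_mid + 0.5 * dx * s)      # p at the quadrature points
        proj = np.zeros_like(vals)
        for ell in range(k + 1):
            P = L.Legendre.basis(ell)(s)
            # Legendre coefficient of the local projection.
            coef = (2 * ell + 1) / 2.0 * np.sum(w * P * vals)
            proj += coef * P
        err2 += 0.5 * dx * np.sum(w * (vals - proj) ** 2)
    return np.sqrt(err2)


def observed_order(p, k, n_cells):
    """Estimated convergence order when the mesh is refined once."""
    coarse = l2_projection_error(p, k, n_cells)
    fine = l2_projection_error(p, k, 2 * n_cells)
    return np.log2(coarse / fine)
```

For a smooth function such as `lambda x: np.sin(2 * np.pi * x)`, the observed order should be close to $k + 1$, in agreement with the factor $(\Delta x)^{k+1}$ in the estimate.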
7 The LDG method for other nonlinear parabolic
problems: Propagating surfaces

7.1 Introduction

In this chapter, we briefly show how to extend the LDG method to nonlinear
second-order parabolic equations. We consider the following model problem:

\[
\begin{aligned}
\varphi_t + F(D\varphi, D^2\varphi) &= 0 \quad \text{in } (0,1)^d \times (0,T), \\
\varphi(x, 0) &= \varphi_0(x), \quad \forall\, x \in (0,1)^d,
\end{aligned}
\]

where we take periodic boundary conditions and assume that $F$ is nonincreasing
in the second variable. For the definition and properties of the viscosity
solution of this and more general problems of this type, see the work by
Crandall, Ishii, and Lions [29].

For simplicity, we only consider the two-dimensional case, $d = 2$:

\[
\varphi_t + F(\varphi_x, \varphi_y, \varphi_{xx}, \varphi_{xy}, \varphi_{yy}) = 0 \quad \text{in } (0,1)^2 \times (0,T),
\tag{7.1}
\]

\[
\varphi(x, 0) = \varphi_0(x), \quad \forall\, x \in (0,1)^2,
\tag{7.2}
\]

with periodic boundary conditions. The material presented in this section is
based on the work of Hu and Shu [43].


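7.2 The method

The idea for extending the LDG method to this case is to rewrite the problem
(7.1), (7.2) for $\varphi$ as follows:

\[ \varphi_t = -F(u, v, p, q, r) \quad \text{in } (0,1)^2 \times (0,T), \tag{7.3} \]

\[ \varphi(x, 0) = \varphi_0(x), \quad \forall\, x \in (0,1)^2, \tag{7.4} \]

where $(u, v, p, q, r)$ solves the following problem:

\[ u_t + F(u, v, p, q, r)_x = 0 \quad \text{in } (0,1)^2 \times (0,T), \tag{7.5} \]

\[ v_t + F(u, v, p, q, r)_y = 0 \quad \text{in } (0,1)^2 \times (0,T), \tag{7.6} \]

\[ p - u_x = 0 \quad \text{in } (0,1)^2 \times (0,T), \tag{7.7} \]

\[ q - u_y = 0 \quad \text{in } (0,1)^2 \times (0,T), \tag{7.8} \]

\[ r - v_y = 0 \quad \text{in } (0,1)^2 \times (0,T), \tag{7.9} \]

\[ u(x, y, 0) = (\varphi_0)_x(x, y), \quad \forall\, (x,y) \in (0,1)^2, \tag{7.10} \]

\[ v(x, y, 0) = (\varphi_0)_y(x, y), \quad \forall\, (x,y) \in (0,1)^2. \tag{7.11} \]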
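
The origin of the auxiliary system can be sketched in a few lines. This short derivation is added for clarity and assumes that $\varphi$ is smooth enough for the derivatives to be interchanged. Setting $u = \varphi_x$, $v = \varphi_y$, $p = \varphi_{xx}$, $q = \varphi_{xy}$ and $r = \varphi_{yy}$, and differentiating (7.1) with respect to $x$ and $y$, gives

```latex
\begin{aligned}
\partial_x \big( \varphi_t + F \big) = 0 &\;\Longrightarrow\; u_t + F(u,v,p,q,r)_x = 0, \\
\partial_y \big( \varphi_t + F \big) = 0 &\;\Longrightarrow\; v_t + F(u,v,p,q,r)_y = 0, \\
p = \varphi_{xx} = u_x, \qquad q &= \varphi_{xy} = u_y, \qquad r = \varphi_{yy} = v_y .
\end{aligned}
```

The first two lines form a conservation-law system for the gradient of $\varphi$, and the last line supplies the three auxiliary relations that the LDG method treats as additional unknowns.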


Again, a straightforward application of the LDG method to the above
problem produces an approximation $(u_h, v_h, p_h, q_h, r_h)$ to $(u, v, p, q, r)$. We can
take each of the approximate solutions to be a piecewise polynomial of degree
$k - 1$. Then, we define the approximation $\varphi_h$ to $\varphi$ by solving the problem
(7.3), (7.4) in the manner described in the chapter on RKDG methods for
multidimensional Hamilton-Jacobi equations.

7.3 Computational results

We present a couple of numerical results that display the good performance
of the method. Our main purpose is to show that the method works well
whether quadrangles or triangles are used.

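First test problem. We consider the problem of a propagating surface:

\[
\begin{cases}
\varphi_t - (1 - \varepsilon K)\, \sqrt{1 + \varphi_x^2 + \varphi_y^2} = 0, & 0 < x < 1,\ 0 < y < 1, \\[4pt]
\varphi(x, y, 0) = 1 - \tfrac14\, (\cos(2\pi x) - 1)\, (\cos(2\pi y) - 1),
\end{cases}
\tag{7.12}
\]

where $K$ is the mean curvature defined by

\[
K = - \frac{ \varphi_{xx}\, (1 + \varphi_y^2) - 2\, \varphi_{xy}\, \varphi_x\, \varphi_y + \varphi_{yy}\, (1 + \varphi_x^2) }
 { (1 + \varphi_x^2 + \varphi_y^2)^{3/2} },
\tag{7.13}
\]

and $\varepsilon$ is a small constant. Periodic boundary conditions are used.

This problem was studied in [72] by using the finite difference ENO
schemes.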
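
For concreteness, the Hamiltonian of this test problem can be written as a function of the five LDG variables. The Python sketch below is illustrative only and is not taken from the notes or from [43]; the function names are hypothetical, and it assumes the sign convention in which $K$ carries a leading minus sign, which is the convention under which the equation is parabolic for $\varepsilon > 0$ and $F$ is nonincreasing in the second derivatives.

```python
import math


def mean_curvature(u, v, p, q, r):
    """K as in (7.13), with u = phi_x, v = phi_y, p = phi_xx,
    q = phi_xy, r = phi_yy (leading minus sign assumed)."""
    numerator = p * (1.0 + v * v) - 2.0 * q * u * v + r * (1.0 + u * u)
    return -numerator / (1.0 + u * u + v * v) ** 1.5


def hamiltonian(u, v, p, q, r, eps):
    """F(u, v, p, q, r) such that (7.12) reads phi_t + F = 0."""
    gradient_factor = math.sqrt(1.0 + u * u + v * v)
    return -(1.0 - eps * mean_curvature(u, v, p, q, r)) * gradient_factor


def initial_surface(x, y):
    """Initial condition of the first test problem."""
    return 1.0 - 0.25 * (math.cos(2.0 * math.pi * x) - 1.0) * (
        math.cos(2.0 * math.pi * y) - 1.0
    )
```

With `eps = 0` the function `hamiltonian` reduces to the pure convection case, and increasing `p` or `r` while keeping the other arguments fixed does not increase its value when `eps > 0`.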

We first use a uniform rectangular mesh of 50 x 50 elements and the local
Lax-Friedrichs flux. The results for $\varepsilon = 0$ (pure convection) and $\varepsilon = 0.1$ are
presented in Fig. 7.1 and Fig. 7.2, respectively. Notice that the surface at
T = 0 is shifted downward by 0.35 in order to show the detail of the solution
at T = 0.3.

Next we use the triangulation shown in Fig. 7.3. We refine the mesh around
the center of the domain, where the solution develops discontinuous derivatives
(for the $\varepsilon = 0$ case). There are 2146 triangles and 1128 nodes in this
triangulation. The solutions are displayed in Fig. 7.4 and Fig. 7.5, respectively, for
$\varepsilon = 0$ (pure convection) and $\varepsilon = 0.1$. Notice that we again shift the solution
at T = 0.0 downward by 0.35 to show the detail of the solutions at later times.

Second test problem. We consider the problem of a propagating surface on a unit
disk. The equation is the same as (7.12) in the previous example, but it is
solved on the unit disk $x^2 + y^2 < 1$ with the initial condition

\[ \varphi(x, y, 0) = \sin\Big( \frac{\pi\, (x^2 + y^2)}{2} \Big) \]

and a Neumann type boundary condition $\nabla \varphi = 0$.

It is difficult to use rectangular meshes for this problem. Instead we use
the triangulation shown in Fig. 7.6. Notice that we have again refined the
mesh near the center of the domain, where the solution develops discontinuous
derivatives. There are 1792 triangles and 922 nodes in this triangulation. The
solutions with $\varepsilon = 0$ are displayed in Fig. 7.7. Notice that the solution at t = 0
is shifted downward by 0.2 to show the detail of the solution at later times.

The solutions with $\varepsilon = 0.1$ are displayed in Fig. 7.8. Notice that the
solution at t = 0 is again shifted downward by 0.2 to show the detail of the
solution at later times.

Fig. 7.1. Propagating surfaces, rectangular mesh, $\varepsilon = 0$.


7.4  Concluding  remarks 

We have shown, briefly, how to extend the LDG method, originally devised
for nonlinear convection-diffusion equations, to second-order parabolic
equations that have a viscosity solution. We have shown that the method works
well without slope limiting and that it works well on both quadrangles and
triangles.

Fig. 7.2. Propagating surfaces, rectangular mesh, $\varepsilon = 0.1$ (panels: P2 and P3,
50x50 elements).

Fig. 7.3. Triangulation used for the propagating surfaces.
Abstract. These notes present an introduction to the spectral element method
with applications to fluid dynamics. The method is introduced for one-dimensional
problems, followed by the discretization of the advection and diffusion operators
in multi-dimensions, and efficient ways of dealing with these operators
numerically. We also discuss the mortar element method, a technique for incorporating
local mesh refinement using nonconforming elements; this is the foundation for
adaptive methods. An adaptive strategy based on analyzing the local polynomial
spectrum is presented and shown to give accurate solutions even for problems with
weak singularities. Finally we describe techniques for integrating the incompressible
Navier-Stokes equations, including methods for performing computational linear
and nonlinear stability analysis of non-parallel and time-periodic flows.

1 Introduction

High-order numerical methods have been used almost exclusively in the
direct numerical simulation of turbulent flows in the last two decades. Under
the broad heading of “high-order methods” we include expansions based on
Fourier series, orthogonal polynomial series, and compact finite difference
schemes. These methods have been used in studies of transition and
turbulence because they offer fast convergence, have small numerical dissipation
and dispersion errors, and can be implemented efficiently on most modern
computer architectures, including vector and parallel supercomputers.
Although they have a higher computational cost per grid point than low-order
finite difference, finite volume, or finite element schemes, they are ultimately
more efficient for the long-time integration of unsteady flow problems [55].

For all their advantages, there are two key issues that prevent these
methods from being applied to more general problems in fluid dynamics: the ability
to simulate flows through geometrically complex domains with general
boundary conditions, and the ability to incorporate local mesh refinement as part
of the convergence process. In these notes we describe a class of
discretizations that have the advantages of global spectral methods outlined above, but
are not subject to their limitations of simple geometries and uniform grids.
These newer techniques go under the name of spectral and h-p finite element
methods, or simply “spectral elements” as they will frequently be referred to
here.

Spectral element methods combine the generality of finite element
methods with some basic ideas from approximation theory about what constitutes
a “good” interpolant. By subdividing a complex domain into macro-elements,
they can provide accurate solutions to many problems with substantially
fewer degrees of freedom than low-order discretizations. High accuracy comes
from the use of orthogonal polynomial expansions to represent the solution
over a single element. Galerkin projection operators relate the differential
and algebraic equations and keep the global system “sparse” by imposing the
minimal continuity requirement on the approximate solution. However, the
ability to simulate more general problems with arbitrarily high-order accuracy
does not come for free! A polynomial spectral code with domain
decomposition and adaptive mesh refinement capabilities is much more complex than
either its Fourier series or finite element counterpart. One purpose of these
notes is to assure the reader that the benefits of spectral element methods far
outweigh the cost of implementation.

Spectral and h-p finite element methods are most commonly based on
Chebyshev and Legendre polynomials. These are complete orthogonal sets
that can be computed easily from a three-term recurrence formula.
However, other polynomials can be useful for special cases. All of the “good”
polynomial series for numerical methods are derived from the same class of
Jacobi polynomials, $P_n^{\alpha,\beta}(x)$. These are the eigenfunctions of an
appropriately defined singular Sturm-Liouville problem. These polynomials form an
expansion basis for representing square-integrable functions $u(x) \in L^2$. The
unknowns of the expansion could be the nodal values of the function on a
selected grid or other coefficients that weight the importance of polynomials
(modes) of different order. The details depend on exactly how the basis is
formed and implemented.
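
As an illustration of the three-term recurrence mentioned above, the following Python sketch evaluates Legendre polynomials with the standard relation $(n+1) P_{n+1}(x) = (2n+1)\, x\, P_n(x) - n\, P_{n-1}(x)$. It is not taken from the notes; it assumes NumPy, and the function name `legendre_values` is hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre as L


def legendre_values(n_max, x):
    """Return an array P with P[n] = P_n(x) for n = 0, ..., n_max,
    computed with the three-term recurrence
    (n + 1) P_{n+1} = (2n + 1) x P_n - n P_{n-1}."""
    x = np.asarray(x, dtype=float)
    P = np.empty((n_max + 1,) + x.shape)
    P[0] = 1.0
    if n_max >= 1:
        P[1] = x
    for n in range(1, n_max):
        P[n + 1] = ((2 * n + 1) * x * P[n] - n * P[n - 1]) / (n + 1)
    return P
```

The values can be compared against `numpy.polynomial.legendre.legval` with a unit coefficient vector. Chebyshev polynomials satisfy an analogous recurrence, $T_{n+1}(x) = 2x\, T_n(x) - T_{n-1}(x)$.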

Eigenfunction expansions based on singular Sturm-Liouville problems
converge at a rate governed by the regularity (smoothness) of the function
being expanded and not by any special boundary conditions. Numerical
solutions of differential equations based on these expansions have the same
property. This observation is important for fluid dynamics, especially for
simulations of incompressible flows since these flows are free of
discontinuities and can typically be approximated well by polynomials. If the solution
is sufficiently smooth then the discretization error decays exponentially fast
to zero, at least asymptotically. Doubling the grid resolution reduces the
error by two orders of magnitude, not by a mere factor of four as in typical
methods with second-order algebraic convergence. Fast convergence is one
key to the computational efficiency of high-order methods: they often require
a higher operation count than low-order methods for a given number of
degrees of freedom, but they require fewer degrees of freedom for a given level
of accuracy.
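
The contrast between exponential and second-order algebraic convergence can be illustrated with a small experiment. The Python sketch below is an assumption-laden illustration rather than an example from the notes: the test function, the resolutions 8 and 16, and all names are hypothetical choices, and it relies on NumPy's `Chebyshev.interpolate` and `numpy.interp`.

```python
import numpy as np
from numpy.polynomial import chebyshev as C


def f(x):
    """A smooth (entire) test function on [-1, 1]."""
    return np.exp(x) * np.sin(3.0 * x)


x_fine = np.linspace(-1.0, 1.0, 2001)


def spectral_error(n):
    """Maximum error of the degree-n Chebyshev interpolant of f."""
    interpolant = C.Chebyshev.interpolate(f, n)
    return np.max(np.abs(interpolant(x_fine) - f(x_fine)))


def linear_error(n):
    """Maximum error of piecewise-linear interpolation on n uniform cells."""
    nodes = np.linspace(-1.0, 1.0, n + 1)
    return np.max(np.abs(np.interp(x_fine, nodes, f(nodes)) - f(x_fine)))


# Error reduction obtained by doubling the resolution from 8 to 16.
ratio_spectral = spectral_error(8) / spectral_error(16)
ratio_linear = linear_error(8) / linear_error(16)
```

For this function, `ratio_linear` stays close to four, as expected of a second-order method, while `ratio_spectral` is larger by several orders of magnitude. The exact gain depends on the smoothness of the function being approximated.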

Exponential  convergence  of  numerical  solutions  in  practical  situations  de¬ 
pends  on  a  number  of  factors.  Although  frequently  cited  as  the  primary  mo¬ 
tivation  for  using  high-order  methods,  exponential  convergence  only  occurs 
once  all  but  the  exponentially  small  high-order  components  of  an  approx¬ 
imation  have  been  resolved;  it  is  probably  the  exception  rather  than  the 
rule  in  simulations  of  complex  phenomena  like  turbulent  flows.1  Convergence 
is  tied  closely  to  issues  like  the  non-uniformity  of  the  mesh,  the  form  of 
geometric  singularities  (e.g.  corners),  discontinuities  in  the  boundary  condi¬ 
tions,  and  so  forth.  Such  features  degrade  convergence  because  they  propa¬ 
gate  into  the  high-order  components  of  the  solution.  These  features  must  be 
isolated or resolved before fast convergence is realized. Multidomain spectral
discretizations like the ones considered here offer such a possibility due to
their dual path of convergence. The accuracy of the numerical model can
be increased in two ways: by increasing the number of subdomain elements
(h-refinement), or by increasing the polynomial order of a fixed number of
elements (p-refinement); this flexibility makes the methods robust.

1 There are other advantages, such as low numerical dissipation and dispersion
errors, that make high-order methods attractive candidates for simulating
turbulence even though a flow may be marginally resolved.

The  following  example  demonstrates  some  of  the  advantages  and  limita¬ 
tions  of  spectral  elements.  Figure  1.1  shows  results  from  a  simulation  of  flow 
past  a  half-cylinder  [43].  This  simulation  could  not  be  performed  with  any 
method  based  on  global  expansions  because  the  domain  cannot  be  mapped  to 
a  simpler  form.  Domain  decomposition  is  a  natural  choice  for  the  discretiza¬ 
tion.  However,  the  sharp  corner  of  the  body  and  the  relatively  thin  shear  layer 
make  the  flow  difficult  to  resolve.  In  the  lower  image  there  are  obvious  “wig¬ 
gles”  in  the  computed  vorticity  field  indicative  of  insufficient  resolution.  These 
are  equivalent  to  the  familiar  aliasing  errors  in  Fourier  spectral  methods,  but 
manifest  in  the  high-order  components  of  the  polynomial  approximation.  In¬ 
creasing  the  polynomial  order  in  this  case  is  a  particularly  inefficient  way  to 
improve  the  approximation  —  the  geometric  singularity  prevents  fast  conver¬ 
gence.  The  fix  is  to  perform  local  mesh  refinement  of  the  boundary  layer  and 
near-wake as shown in the upper part of the figure. Again, no method based
on  global  expansions  is  capable  of  this  path  to  convergence. 

Fig. 1.1. Vorticity in the wake of a half-cylinder at Re = 250: (a) locally refined
mesh using nonconforming spectral elements to resolve the boundary layer and near
wake; (b) conforming mesh where the solution exhibits “wiggles” due to insufficient
resolution. Both simulations are performed with order p = 7.

Spectral elements, like finite elements, require that each subdomain in the
mesh be conforming, that is, aligned edge by edge with each neighboring subdomain.
This requirement is a natural result of the continuity imposed on the
discrete  solution.  Unlike  finite  elements,  spectral  elements  represent  a  coarse 
discretization  of  the  geometry  and  achieve  high  accuracy  by  using  a  fine  mesh 
on  the  interior  of  each  element.  Conforming  finite  elements  are  not  partic¬ 
ularly  restrictive,  but  conforming  spectral  elements  make  mesh  refinement 
difficult  to  implement  and  the  improved  solution  expensive  to  compute. 

Notice  that  the  refined  mesh  in  figure  1.1  contains  nonconforming  el¬ 
ements.  These  are  elements  that  do  not  connect  to  an  entire  neighboring 
edge,  and  as  a  result  special  constraints  are  required  to  impose  the  correct 
continuity  conditions  on  the  solution.  In  spite  of  the  increased  complexity, 
nonconforming  elements  are  key  to  the  efficient  implementation  of  adaptive 
mesh  refinement  for  spectral  element  methods.  They  eliminate  the  need  for 
refinement  boundaries  that  propagate  through  the  entire  domain,  allowing 
refinement  to  be  done  locally  as  dictated  by  some  appropriate  error  indica¬ 
tor. 

Background  material  for  these  notes  can  be  found  in  the  monographs 
by  Gottlieb  and  Orszag  [34],  Canuto  et  al.  [20],  and  Boyd  [17].  These  ref¬ 
erences  cover  global  spectral  methods  extensively,  i.e.  expansions  on  a  sin¬ 
gle  computational  domain.  The  review  article  of  Maday  &  Patera  [57]  also 
provides  background  material,  concentrating  exclusively  on  conforming  dis¬ 
cretizations.  Early  work  with  spectral  elements  focused  primarily  on  meshes 
composed  of  quadrilateral  or  hexahedral  elements.  More  recent  work  has 
made  important  advances  in  the  formulation,  including  meshes  of  noncon¬ 
forming  elements  and  triangular  and  tetrahedral  elements.  These  new  tools 
are  the  cornerstones  of  adaptive  mesh  generation  and  true  h-p  refinement. 
This  is  the  class  of  algorithms  emphasized  in  these  notes.  In  addition  to 
the  basic  theory  and  implementation  of  spectral  element  methods,  we  also 
discuss  a  number  of  applications  to  the  simulation  of  incompressible  flows. 
Finally  we  discuss  useful  methods  for  studying  flow  instabilities,  transition, 
and  turbulence  —  all  ideal  applications  of  spectral  element  methods. 

2  One-Dimensional  Problems 

Most  of  the  basic  numerical  machinery  required  for  spectral  element  methods 
can  be  described  in  terms  of  one-dimensional  problems.  In  this  section  we 
provide  a  step-by-step  formulation  of  a  spectral  element  solver  for  a  model 
advection-diffusion  equation  to  illustrate  the  procedure  before  going  on  to 
the  Navier-Stokes  equations.  In  higher  dimensions  we  have  to  worry  about 
representing  the  geometry  with  more  complicated  elements,  but  most  of  the 
basic  operations  are  the  same. 

While  reading  this  section,  keep  the  following  point  in  mind:  the  proce¬ 
dure  used  to  derive  a  “spectral”  element  method  is  exactly  the  same  as  that 
used to derive a finite element method. Any finite element discretization can
be  extended  to  higher  order  using  the  methods  we  discuss  here.  What  we 
emphasize  are  efficient  ways  to  achieve  high-order  accuracy  within  the  finite 
element  framework,  using  concepts  developed  originally  for  spectral  methods. 
To  stress  this  connection,  we  try  to  keep  the  notation  as  close  as  possible  to 
that  used  in  standard  finite  element  textbooks. 


2.1  Galerkin  formulation 


Suppose we want to find $u$ such that

$$ u'' + f = 0 \quad \text{on } \Omega, \tag{2.1} $$

where $\Omega$ is the unit interval $0 < x < 1$ and $f : [0, 1] \to \mathbb{R}$ is a given smooth
function.2 At the endpoints we will specify the boundary conditions

$$ u(0) = g, \tag{2.2a} $$

$$ u'(1) = h. \tag{2.2b} $$


This  defines  the  strong  form  of  the  problem,  the  usual  starting  point  for  finite 
difference  and  spectral  collocation  schemes. 

Consider  the  following  alternative  formulation  of  the  same  problem.  We 
begin  with  the  equation  for  the  residual, 

$$ R(u) = \int_\Omega w \, (u'' + f) \, dx, \tag{2.3} $$

from which we want to find the unique function $u$ that drives the residual to
zero. The search will include all functions satisfying the boundary condition
$u(0) = g$; each candidate is called a trial solution, and we denote the set of all
trial solutions by $\mathcal{S}$. The residual is orthogonalized with respect to a second
set of functions $w \in \mathcal{V}$ called test functions or variations. Each test function
should satisfy $w(0) = 0$. To incorporate the Neumann boundary condition
we integrate (2.3) once by parts, finding that $R(u) = 0$ if

$$ \int_\Omega w' u' \, dx = \int_\Omega w f \, dx + w(1) h. \tag{2.4} $$


For this expression to make sense, both $u$ and $w$ must have square-integrable
first derivatives, i.e. $\int_\Omega (u')^2 \, dx < \infty$. Recognizing that such functions belong
to the Sobolev space $H^1$, we can summarize the sets of trial and test functions
as:

$$ \mathcal{S} = \{ u \mid u \in H^1,\ u(0) = g \}, \qquad \mathcal{V} = \{ w \mid w \in H^1,\ w(0) = 0 \}. \tag{2.5} $$

2 We use the term smooth as a qualitative description of a function’s higher
derivatives. A smooth function $f(x)$ has bounded higher derivatives $f^{(r)}(x)$.

If we identify the symmetric, bilinear forms $a(w, u) = \int_\Omega w' u' \, dx$ and
$(w, f) = \int_\Omega w f \, dx$, then we can state the weak form as follows: find $u \in \mathcal{S}$
such that for every $w \in \mathcal{V}$

$$ a(w, u) = (w, f) + w(1) h. \tag{2.6} $$


Equation (2.6) is still an infinite-dimensional problem, because the spaces
$\mathcal{S}$ and $\mathcal{V}$ each contain an infinite number of functions. Galerkin approximation
solves (2.6) using a finite collection of functions: find $u^h \in \mathcal{S}^h$ such that for
every $w^h \in \mathcal{V}^h$

$$ a(w^h, u^h) = (w^h, f) + w^h(1) h. \tag{2.7} $$

This method reduces an infinite-dimensional problem to an n-dimensional
problem by choosing a set of $n$ basis functions $(\phi_1, \ldots, \phi_n)$ to represent
each member of $\mathcal{S}^h$ and $\mathcal{V}^h$. It admits all linear combinations $w^h \in \mathcal{V}^h$ as
$w^h = c_1 \phi_1 + c_2 \phi_2 + \cdots + c_n \phi_n$, where each $\phi_p(0) = 0$. To generate the trial
solutions we need one additional function satisfying $\phi_{n+1}(0) = 1$ so that if
$u^h \in \mathcal{S}^h$ then

$$ u^h = g \, \phi_{n+1} + \sum_{p=1}^{n} d_p \phi_p. \tag{2.8} $$

Note that with the exception of $\phi_{n+1}$, $\mathcal{S}^h$ and $\mathcal{V}^h$ are composed of the same
functions.

Substituting $u^h$ for $u$ and $w^h$ for $w$, the weak form becomes

$$ \sum_{p=1}^{n} c_p G_p = 0, \tag{2.9} $$

where

$$ G_p = \sum_{q=1}^{n} a(\phi_p, \phi_q) \, d_q - (\phi_p, f) - \phi_p(1) h + a(\phi_p, \phi_{n+1}) \, g. \tag{2.10} $$

Since this must be true for any choice of the $c_p$’s, we require $G_p = 0$. If we put
the coefficients $d_p$ into a vector $d$, it becomes the matrix problem $A d = F$,
where the matrix entries are given by $A_{pq} = a(\phi_p, \phi_q)$ and the components
of the vector $F$ are $F_p = (\phi_p, f) + \phi_p(1) h - a(\phi_p, \phi_{n+1}) g$. The solution is
$d = A^{-1} F$. Quite literally, this is a best fit of the approximate solution $u^h$
to the true solution $u$ based on the measure of error given in (2.3).
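
The matrix problem above can be tried out on a very small case. The following is only a sketch under stated assumptions: the basis $\phi_p = x^p$ ($p = 1, \ldots, n$) with $\phi_{n+1} = 1$ is chosen here purely for illustration (it is not the basis used later in these notes), and the function name `galerkin_solve` is hypothetical.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

def galerkin_solve(f, g, h, n):
    """Galerkin solution of u'' + f = 0 on (0,1), u(0) = g, u'(1) = h.

    Illustrative basis (an assumption): phi_p = x**p, p = 1..n, phi_{n+1} = 1.
    """
    # Gauss-Legendre rule mapped from [-1, 1] to [0, 1] for the load integrals
    t, w = leggauss(n + 8)
    x, w = 0.5 * (t + 1.0), 0.5 * w
    p = np.arange(1, n + 1)
    # A_pq = a(phi_p, phi_q) = int_0^1 p q x**(p+q-2) dx = p q / (p + q - 1)
    A = np.outer(p, p) / (p[:, None] + p[None, :] - 1.0)
    # F_p = (phi_p, f) + phi_p(1) h - a(phi_p, phi_{n+1}) g.
    # Here phi_p(1) = 1 and phi_{n+1}' = 0, so the last term vanishes.
    F = np.array([np.sum(w * x**k * f(x)) for k in p]) + h
    d = np.linalg.solve(A, F)
    # u^h = g phi_{n+1} + sum_p d_p phi_p
    return lambda xx: g + sum(dk * xx**k for dk, k in zip(d, p))

# f = 1 has the exact solution u = g + (h + 1) x - x**2 / 2, which lies in S^h
u_h = galerkin_solve(lambda x: np.ones_like(x), g=2.0, h=3.0, n=3)
xs = np.linspace(0.0, 1.0, 11)
max_err = np.max(np.abs(u_h(xs) - (2.0 + 4.0 * xs - 0.5 * xs**2)))
print("max error:", max_err)
```

Because the exact solution is a quadratic, it belongs to the trial space and the Galerkin solution reproduces it to round-off. The monomial basis becomes badly conditioned as $n$ grows, which is one reason a more careful choice of basis is needed.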

The Galerkin formulation, treated in most standard texts on finite element
methods [44,76], is one example of a general class of techniques called
weighted residual methods [29]. For certain differential equations it reproduces
the underlying variational principle if one exists. The idea behind a variational
principle is that some physical quantity, such as potential energy, is
minimized over the problem domain. For example, the Rayleigh-Ritz principle
corresponding to (2.1) minimizes the quadratic form

$$ I(u) = \frac{1}{2} \int_\Omega (u')^2 \, dx - \int_\Omega u f \, dx. \tag{2.11} $$

The  Galerkin  formulation  produces  the  same  solution,  but  it  can  be  developed 
even  for  differential  equations  that  have  no  corresponding  variational  form. 


2.2  Basis  functions 

Galerkin  approximation  is  “optimal”  in  the  sense  that  it  gives  the  best  ap¬ 
proximation  in  the  restricted  space  Sh.  If  the  true  solution  u  lies  in  the 
intersection  of  Sh  and  S,  then  uh  =  u.  But  the  success  of  the  method  lies 
in  the  selection  of  the  basis  functions.  If  they  are  too  complicated  it  will  be 
impossible  to  generate  the  matrix  problem,  too  simple  and  they  cannot  ad¬ 
equately  describe  the  true  solution  u.  The  key  is  to  combine  computability 
and  accuracy.  Spectral  elements  accomplish  this  in  the  following  manner. 

First, the domain is partitioned into $K$ non-overlapping subintervals,
where each subinterval, or element, is given by $\Omega^k = [a_k, b_k]$. On element
$k$ we want to introduce a set of local functions that provide accuracy of order
$N$ for the solution over that piece of the computational domain. For spectral
element methods, the basis functions are invariably polynomials.

Often the most convenient approach is to form a set of polynomials from
the Lagrangian interpolants through a particular set of nodes. Recall that the
Lagrangian interpolant takes the value one at some node $x_i$ and is zero at all
other nodes. The simplest set of nodes would be the equally spaced points
$x_i = a_k + (b_k - a_k) \, i/N$. Unfortunately, this turns out to be a terrible choice for a
high-order method because the basis is almost linearly dependent, resulting
in ill-conditioned algebraic systems. The fault lies not with the Lagrangian
interpolants but with the nodes we define them over, so to fix the problem we
just need to choose a “good” set of nodes, and this is where spectral methods
start to shape the formulation.
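
The effect of the node choice can be made concrete with a small experiment. The sketch below compares equally spaced nodes with the Gauss-Lobatto Legendre nodes (the roots of $(1 - \xi^2) L_N'(\xi)$) using the Lebesgue constant, $\max_\xi \sum_i |\phi_i(\xi)|$. The Lebesgue constant is a standard measure of how strongly interpolation can amplify errors; it is used here as an illustration and is not a quantity discussed in these notes. The helper names are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre as L

def gll_nodes(N):
    """Roots of (1 - xi^2) L_N'(xi): the N+1 Gauss-Lobatto Legendre nodes."""
    cN = np.zeros(N + 1)
    cN[N] = 1.0                                   # Legendre series of L_N
    interior = np.sort(np.real(L.legroots(L.legder(cN))))
    return np.concatenate(([-1.0], interior, [1.0]))

def lebesgue_constant(nodes, x):
    """Maximum over the sample points x of sum_i |phi_i(x)|."""
    total = np.zeros_like(x)
    for i, xi in enumerate(nodes):
        others = np.delete(nodes, i)
        # Lagrangian interpolant: phi_i(x) = prod_{j != i} (x - x_j) / (x_i - x_j)
        phi = np.prod((x[:, None] - others) / (xi - others), axis=1)
        total += np.abs(phi)
    return total.max()

N = 16
x = np.linspace(-1.0, 1.0, 2001)
leb_equi = lebesgue_constant(np.linspace(-1.0, 1.0, N + 1), x)
leb_gll = lebesgue_constant(gll_nodes(N), x)
print(f"N = {N}: equispaced {leb_equi:.1f}, GLL {leb_gll:.2f}")
```

For equally spaced nodes the constant grows exponentially with $N$, whereas for the GLL nodes it grows only slowly, which is consistent with the poor conditioning described above.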

To standardize the basis, we introduce a parent domain with the coordinates
$-1 \le \xi \le 1$, and a coordinate transformation to the elemental nodes
as

$$ x_i = a_k + \frac{b_k - a_k}{2} (1 + \xi_i). \tag{2.12} $$

Now we choose the nodes $\xi_i$ to be the solutions of $(1 - \xi^2) L_N'(\xi) = 0$, where
$L_N(\xi)$ is the Legendre polynomial of degree $N$. With this special choice, the
Lagrangian interpolants can be written down explicitly as

$$ \phi_i(\xi) = - \frac{(1 - \xi^2) L_N'(\xi)}{N(N+1) L_N(\xi_i) (\xi - \xi_i)}. \tag{2.13} $$

Fig. 2.1. One-dimensional spectral element basis functions for an expansion order
of $N = 4$, along with a sketch of the local and global coordinate systems: (a) modal
basis constructed from Jacobi polynomials $P_i^{1,1}(\xi)$; (b) Gauss-Lobatto Legendre basis
and the set of nodal points that define them as Lagrangian interpolants.


These  polynomials  are  called  the  Gauss-Lobatto  Legendre  (GLL)  interpolants. 
Figure  2.1  illustrates  the  mesh  and  basis  functions  for  a  typical  element.  We 
will  refer  to  any  basis  defined  this  way  as  a  nodal  basis. 

There are several important reasons for choosing this set of polynomials.
First, the expansion of any smooth function using the GLL interpolants,
$u \approx u^h = \sum_i d_i \phi_i(x)$, converges exponentially fast, as can be demonstrated
by singular Sturm-Liouville theory [34]. Because these are Lagrangian interpolants,
the coefficients $d_i$ are simply the nodal values of the approximate
solution: $d_i = u^h(x_i)$. Also, there is a set of integration weights $\rho_i$ associated
with the nodes $\xi_i$ so that the integrals appearing in the weak form can be
computed via the GLL quadrature

$$ \int_{-1}^{1} f(\xi) \, d\xi = \sum_{i=0}^{N} \rho_i f(\xi_i) + \varepsilon_N, \tag{2.14} $$

where the error $\varepsilon_N \sim O(f^{(2N)}(\zeta))$ for some point $-1 \le \zeta \le 1$; as long as
the integrand is a polynomial of degree less than $2N$ this quadrature rule is
exact [25]. Finally, and perhaps most importantly, the interpolants, quadrature
points, and weights can be generated within a computer program by
recursive algorithms that are numerically stable through values of $N \sim 100$,
eliminating the need to store static tables of quadrature data.
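
As a rough illustration of generating the quadrature data at run time, the sketch below computes GLL nodes and weights with NumPy's Legendre utilities. Two assumptions are made: the nodes come from a companion-matrix root finder rather than from the recursive algorithms mentioned above, and the weights use the standard closed form $\rho_i = 2 / [N(N+1) L_N(\xi_i)^2]$, which is not written out in these notes. The helper names are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre as L

def gll(N):
    """GLL nodes xi_i and weights rho_i for polynomial order N (N+1 points)."""
    cN = np.zeros(N + 1)
    cN[N] = 1.0                                   # Legendre series of L_N
    interior = np.sort(np.real(L.legroots(L.legder(cN))))   # roots of L_N'
    xi = np.concatenate(([-1.0], interior, [1.0]))
    rho = 2.0 / (N * (N + 1) * L.legval(xi, cN) ** 2)
    return xi, rho

N = 6
xi, rho = gll(N)

def quad_error(deg):
    """Error of the GLL rule for the even monomial xi**deg on [-1, 1]."""
    exact = 2.0 / (deg + 1)
    return abs(np.sum(rho * xi**deg) - exact)

err_inside = quad_error(2 * N - 2)    # degree below 2N: exact up to round-off
err_outside = quad_error(2 * N)       # degree 2N: no longer exact
print(f"degree {2*N-2}: {err_inside:.2e}   degree {2*N}: {err_outside:.2e}")
```

The two printed errors show the exactness limit stated with (2.14): the rule integrates polynomials of degree less than $2N$ exactly and fails at degree $2N$.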

Legendre polynomials are one example of a broad polynomial class called
the generalized Jacobi polynomials, which we denote as $P_n^{\alpha,\beta}(\xi)$. Legendre
polynomials correspond to the parameter values $\alpha = 0$, $\beta = 0$. Sometimes,
especially in higher dimensions and on more complex domains, it is more
convenient to work directly with the polynomials rather than an intermediate
Lagrangian basis. Jacobi polynomials have the orthogonality property

$$ \int_{-1}^{1} (1 - \xi)^\alpha (1 + \xi)^\beta P_i^{\alpha,\beta}(\xi) P_j^{\alpha,\beta}(\xi) \, d\xi = \delta_{ij}. \tag{2.15} $$

We can use Jacobi polynomials directly to represent a function through the
expansion $u^h = \sum_i d_i P_i^{\alpha,\beta}(x)$. The values $d_i$ are the coefficients of the basis
functions but they do not correspond to any set of nodal values. In practice,
there is a significant advantage if most of the basis functions are orthogonal,
so in the one-dimensional case we would use:

$$ \phi_0(\xi) = \tfrac{1}{2} (1 + \xi), $$

$$ \phi_1(\xi) = \tfrac{1}{2} (1 - \xi), \tag{2.16} $$

$$ \phi_i(\xi) = \tfrac{1}{4} (1 + \xi)(1 - \xi) P_{i-2}^{1,1}(\xi), \quad i \ge 2. $$

Figure  2.1  shows  the  first  five  basis  functions  constructed  this  way.  In  the 
nodal  basis  every  function  is  a  polynomial  of  degree  N.  In  the  modal  basis 
there  is  a  hierarchy  of  modes  starting  with  the  linear  modes,  proceeding  with 
the quadratic, the cubic, and so on. Such a basis can accommodate hierarchical
p-refinement more readily by increasing the polynomial order. It is also
useful to distinguish between hierarchic and non-hierarchic representations.
In a hierarchic basis we can easily define a sequence of approximation spaces
such that $\mathcal{S}^n \subset \mathcal{S}^{n+1}$. This ensures that the error decreases monotonically;
in non-hierarchic constructions this may or may not be possible [7].
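
A modal basis of this kind is easy to evaluate numerically. The sketch below assumes the form $\phi_0 = \tfrac12(1+\xi)$, $\phi_1 = \tfrac12(1-\xi)$, $\phi_i = \tfrac14(1+\xi)(1-\xi)P_{i-2}^{1,1}(\xi)$ for $i \ge 2$, and evaluates $P_n^{1,1}$ through the standard identity $P_n^{1,1}(\xi) = \frac{2}{n+2} L_{n+1}'(\xi)$ so that only NumPy is needed. The normalization and the function names are assumptions made for illustration.

```python
import numpy as np
from numpy.polynomial import legendre as L

def jacobi11(n, x):
    """P_n^{1,1}(x), using P_n^{1,1} = 2/(n+2) * d/dx L_{n+1}."""
    c = np.zeros(n + 2)
    c[n + 1] = 1.0                       # Legendre series of L_{n+1}
    return 2.0 / (n + 2) * L.legval(x, L.legder(c))

def modal_basis(i, x):
    """Hierarchical modal basis: two linear modes plus interior modes."""
    if i == 0:
        return 0.5 * (1.0 + x)
    if i == 1:
        return 0.5 * (1.0 - x)
    return 0.25 * (1.0 + x) * (1.0 - x) * jacobi11(i - 2, x)

x = np.linspace(-1.0, 1.0, 5)
for i in range(5):
    print(i, np.round(modal_basis(i, x), 4))
```

Only the two linear modes are non-zero at the element ends; every mode with $i \ge 2$ vanishes at $\xi = \pm 1$. Raising the order therefore adds new interior modes without changing the existing ones, which is what makes the basis hierarchical.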

We will refer to spectral elements constructed from a nodal basis as Lagrange
spectral elements and to those based on a modal basis as h-p elements.
The latter were first introduced in the early seventies by Szabo [77]
who used the integrals of Legendre polynomials as a modal basis, taking
$\phi_i(\xi) = \int_{-1}^{\xi} L_{i-1}(s) \, ds$. However, using the properties of Jacobi polynomials
[1] we obtain

$$ 2(n-1) \int_{-1}^{\xi} P_{n-1}^{0,0}(s) \, ds = -(1 - \xi)(1 + \xi) P_{n-2}^{1,1}(\xi), \tag{2.17} $$

which is the same as the basis in (2.16) except for the normalization.

The  choice  of  which  approach  to  take  is  somewhat  arbitrary  since  a  nodal 
basis  can  always  be  transformed  to  an  equivalent  modal  basis  and  vice  versa. 
The Fast Fourier Transform (FFT) is one familiar example of such a transformation
onto the basis $\phi_k(\xi) = \exp(ik\xi)$. Unfortunately, there are no “fast
transform” methods for Jacobi polynomials and the transforms require matrix
multiplication. However, for the values of $N$ used in practice ($N < 16$)
this is not a serious drawback. Note that for a given polynomial order, the formal
accuracy of any basis is the same. Although the modal basis may at first
appear to have an advantage for performing local p-refinement, the nodal
basis  can  be  implemented  as  a  matrix-free  method  that  suffers  no  penalty 
for  increasing  the  local  polynomial  order.  The  simplicity  of  working  with 
grid-point  values  in  the  nodal  basis  is  an  attractive  feature.  Ultimately,  the 
decision  is  a  matter  of  personal  choice — there  is  no  convincing  argument  for 
the  exclusive  use  of  one  basis  type  over  the  other. 
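
The matrix-multiplication transform between nodal and modal representations can be sketched as follows. For simplicity this example uses plain Legendre polynomials $L_j$ as the modal basis instead of the basis in (2.16); that substitution, and the names used, are assumptions made for illustration. The transform matrix is the generalized Vandermonde matrix $V_{ij} = L_j(\xi_i)$ evaluated at the GLL nodes.

```python
import numpy as np
from numpy.polynomial import legendre as L

def gll_nodes(N):
    """Roots of (1 - xi^2) L_N'(xi): the N+1 Gauss-Lobatto Legendre nodes."""
    cN = np.zeros(N + 1)
    cN[N] = 1.0
    interior = np.sort(np.real(L.legroots(L.legder(cN))))
    return np.concatenate(([-1.0], interior, [1.0]))

N = 10
xi = gll_nodes(N)
V = L.legvander(xi, N)                 # V[i, j] = L_j(xi_i)

u_nodal = np.exp(xi)                   # nodal values of a smooth function
u_modal = np.linalg.solve(V, u_nodal)  # nodal -> modal: solve V c = u
u_back = V @ u_modal                   # modal -> nodal: one matrix product

print("modal coefficients:", np.array2string(u_modal, precision=2))
print("round-trip error  :", np.max(np.abs(u_back - u_nodal)))
```

The modal coefficients of the smooth function decay rapidly with mode number, and the round trip reproduces the nodal values to round-off. Both directions cost $O(N^2)$ operations per element, which is inexpensive for the orders quoted above.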

For  the  remainder  of  this  section  we  will  work  with  the  GLL  polynomials, 
but  when  we  introduce  the  basis  on  triangular  and  tetrahedral  subdomains 
we  will  switch  back  to  the  modal  point  of  view. 

2.3  Discrete  equations 

Returning to the problem of solving (2.7), we begin by noting that the integral
can be broken into a sum of integrals over each element:

$$ a(\phi_p, \phi_q)_\Omega = \sum_{k=1}^{K} a(\phi_p, \phi_q)_{\Omega^k}. $$

Since each basis function is non-zero over a single element, the inner product
$a(\phi_p, \phi_q)$ is non-zero only if $\phi_p$ and $\phi_q$ “belong” to the same element. This
makes the global system sparse, and allows us to compute only local matrices.
Because of the origin of finite element methods in computational mechanics,
these matrices are traditionally called:

“mass”       $M^k_{pq} = \int_{\Omega^k} \phi_p \phi_q \, dx$,

“stiffness”  $A^k_{pq} = \int_{\Omega^k} \phi_p' \phi_q' \, dx$.

To construct the right-hand side of the matrix system, $f(x)$ is approximated
by collocation at the nodal points to produce $f^h(x)$; the mass matrix provides
the coefficients necessary to perform the integration. Now the elemental
matrix system may be written as

$$ A^k v^k = F^k \quad (+\ \text{boundary terms}). \tag{2.18} $$

Just as the integral over the entire domain can be written as a sum of the
integral over each element, the global matrices can be computed by summing
contributions from the elemental matrices:

$$ A = {\sum_{k=1}^{K}}' A^k, \qquad M = {\sum_{k=1}^{K}}' M^k. \tag{2.19} $$

The symbol $\sum'$ represents “direct stiffness summation,” the procedure diagrammed
for the nodal basis in Fig. 2.2 that maps contributions from the
boundary node shared by adjacent elements to the same row of the global
matrix $A$. The global matrix system is

$$ A v = F \quad (+\ \text{boundary terms}). \tag{2.20} $$

$A$ is banded as a result of using local basis functions, with all of its non-zero
entries located in the $N$ diagonals above and below the main diagonal. It is
also symmetric, due to the symmetry of $a(\cdot, \cdot)$, and positive-definite. Thus $A$
can be computed, stored, and factored economically and efficiently.
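
The direct stiffness summation can be sketched in a few lines for the nodal (GLL) basis. The code assumes $K$ uniform elements on $[a, b]$, evaluates the elemental integrals with the GLL quadrature (so the mass matrix comes out diagonal), and stores the global matrices densely for clarity. The function names, and the use of the standard GLL differentiation matrix $D_{ij} = \phi_j'(\xi_i)$, are choices made for this illustration.

```python
import numpy as np
from numpy.polynomial import legendre as L

def gll(N):
    """GLL nodes and weights for polynomial order N."""
    cN = np.zeros(N + 1)
    cN[N] = 1.0
    interior = np.sort(np.real(L.legroots(L.legder(cN))))
    xi = np.concatenate(([-1.0], interior, [1.0]))
    return xi, 2.0 / (N * (N + 1) * L.legval(xi, cN) ** 2)

def diff_matrix(xi):
    """D[i, j] = phi_j'(xi_i) for the GLL Lagrangian interpolants."""
    N = len(xi) - 1
    cN = np.zeros(N + 1)
    cN[N] = 1.0
    LN = L.legval(xi, cN)
    D = np.zeros((N + 1, N + 1))
    for i in range(N + 1):
        for j in range(N + 1):
            if i != j:
                D[i, j] = LN[i] / (LN[j] * (xi[i] - xi[j]))
        D[i, i] = -D[i].sum()          # constants have zero derivative
    return D

def assemble(K, N, a=0.0, b=1.0):
    """Global stiffness A and mass M from K identical elemental matrices."""
    xi, rho = gll(N)
    D = diff_matrix(xi)
    size = (b - a) / K
    Ak = (2.0 / size) * D.T @ (rho[:, None] * D)   # int phi_p' phi_q' dx
    Mk = 0.5 * size * np.diag(rho)                 # int phi_p phi_q dx (GLL rule)
    n = K * N + 1
    A, M = np.zeros((n, n)), np.zeros((n, n))
    for k in range(K):
        idx = np.arange(k * N, k * N + N + 1)      # local -> global numbering
        A[np.ix_(idx, idx)] += Ak                  # shared end nodes accumulate
        M[np.ix_(idx, idx)] += Mk
    return A, M

A, M = assemble(K=3, N=4)
print("global size:", A.shape, " symmetric:", np.allclose(A, A.T))
```

The last local node of element $k$ and the first local node of element $k+1$ map to the same global index, so their contributions land in the same row and column. That overlap is the only coupling between neighboring elements.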


Fig. 2.2. Schematic of the direct stiffness summation of local matrices $A^k$ to form
the global matrix $A$.


Spectral element discretizations encompass both spectral methods and
finite elements. Standard approximation error estimates for Galerkin methods
applied to elliptic problems on quasi-uniform meshes predict that

$$ \| u - u^h \|_1 \le \text{const.} \times h^{\mu - 1} N^{-(k-1)} \| u \|_k, \tag{2.21} $$

where $\mu = \min(k, N + 1)$, $k$ is the Sobolev index measuring the regularity of
the solution ($u \in H^k$), $N$ is the polynomial degree appearing in the basis
functions, and $h$ is a parameter related to the element size [7]. The constant
depends on the degree of mesh quasi-uniformity. There are two ways to improve
the approximation: make $h$ smaller ($K \to \infty$), or make $N$ and $\mu$ larger
($N \to \infty$). The latter results in exponential convergence for smooth solutions.
If a solution varies rapidly over a small region, any polynomial fit will
oscillate rapidly and the best approach is to reduce the element size until
the solution is resolved locally. A more effective approach is to combine the
two convergence procedures, increasing both $K$ and $N$ simultaneously; this
dual path of convergence is known as an h-p refinement procedure [77]. The
flexibility to adapt the mesh to the solution makes spectral element methods
quite robust. The following example clarifies these concepts.
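
Before turning to that example, the two convergence paths can be observed directly on the model problem (2.1)-(2.2). The sketch below is a minimal nodal spectral element solver on uniform elements; the test solution $u = 1 + \sin 3x$, the function names, and the dense linear solve are all assumptions made for this illustration.

```python
import numpy as np
from numpy.polynomial import legendre as L

def gll(N):
    cN = np.zeros(N + 1)
    cN[N] = 1.0
    interior = np.sort(np.real(L.legroots(L.legder(cN))))
    xi = np.concatenate(([-1.0], interior, [1.0]))
    return xi, 2.0 / (N * (N + 1) * L.legval(xi, cN) ** 2)

def diff_matrix(xi):
    N = len(xi) - 1
    cN = np.zeros(N + 1)
    cN[N] = 1.0
    LN = L.legval(xi, cN)
    D = np.zeros((N + 1, N + 1))
    for i in range(N + 1):
        for j in range(N + 1):
            if i != j:
                D[i, j] = LN[i] / (LN[j] * (xi[i] - xi[j]))
        D[i, i] = -D[i].sum()
    return D

def solve(K, N, f, g, h):
    """Spectral element solution of u'' + f = 0 on (0,1), u(0)=g, u'(1)=h."""
    xi, rho = gll(N)
    D = diff_matrix(xi)
    size = 1.0 / K                                   # uniform element size
    Ak = (2.0 / size) * D.T @ (rho[:, None] * D)
    n = K * N + 1
    x, A, F = np.zeros(n), np.zeros((n, n)), np.zeros(n)
    for k in range(K):
        idx = np.arange(k * N, k * N + N + 1)
        x[idx] = k * size + 0.5 * size * (1.0 + xi)  # mapping (2.12)
        A[np.ix_(idx, idx)] += Ak
        F[idx] += 0.5 * size * rho * f(x[idx])       # (phi_p, f) by GLL quadrature
    F[-1] += h                                       # natural condition: w(1) h
    u = np.zeros(n)
    u[0] = g                                         # essential condition
    u[1:] = np.linalg.solve(A[1:, 1:], F[1:] - A[1:, 0] * g)
    return x, u

exact = lambda x: 1.0 + np.sin(3.0 * x)
f = lambda x: 9.0 * np.sin(3.0 * x)                  # f = -u''
g, h = 1.0, 3.0 * np.cos(3.0)
errors = {}
for K, N in [(2, 4), (4, 4), (2, 8)]:
    x, u = solve(K, N, f, g, h)
    errors[(K, N)] = np.max(np.abs(u - exact(x)))
    print(f"K = {K}, N = {N}: max nodal error = {errors[(K, N)]:.2e}")
```

Going from $(K, N) = (2, 4)$ to $(4, 4)$ is h-refinement and to $(2, 8)$ is p-refinement. Both use the same number of degrees of freedom, and for this smooth solution the p-refined case should be far more accurate.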

2.4 Example: Burgers equation


Consider the nonlinear differential equation

$$ \frac{\partial u}{\partial t} + \frac{1}{2} \frac{\partial u^2}{\partial x} = \nu \frac{\partial^2 u}{\partial x^2}, \tag{2.22} $$

subject to the homogeneous boundary conditions $u(-1) = u(1) = 0$, and
smooth initial conditions. Introduced by J. M. Burgers [19], this equation
represents a simplified model of the more complicated Navier-Stokes equations
that captures the essential features of incompressible fluid dynamics: an
unsteady term, a nonlinear advection term, and a viscous diffusion term. Our
goal is a numerical method to follow the evolution of a waveform governed
by this equation.

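Let $u^n(x) \approx u(x, t_n)$ be the approximate solution at time level $t_n = n \Delta t$,
where $\Delta t$ is the time step and $n$ is the time step number. In order to treat
the linear and nonlinear terms in the most efficient way possible, we can
integrate (2.22) using the two-step splitting scheme

$$ \frac{\hat{u} - u^n}{\Delta t} = -\frac{1}{2} \sum_{q=0}^{2} \beta_q \frac{\partial}{\partial x} (u^{n-q})^2, \tag{2.23a} $$

$$ \frac{u^{n+1} - \hat{u}}{\Delta t} = \frac{\nu}{2} \frac{\partial^2}{\partial x^2} (u^{n+1} + u^n). \tag{2.23b} $$

The nonlinear term is treated explicitly with a third-order Adams-Bashforth
scheme while the linear term is handled with an unconditionally stable,
second-order Crank-Nicolson scheme. The values of the $\beta_q$’s are:

$$ \beta_0 = \frac{23}{12}, \quad \beta_1 = -\frac{16}{12}, \quad \beta_2 = \frac{5}{12}. \tag{2.24} $$

Since $\hat{u}$ is just an intermediate solution used to decouple the two steps, boundary
conditions will only be applied in the diffusion step to $u^{n+1}$.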

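Spectral elements form the spatial discretization, so on element $k$ we have

$$ u^n(x) = \sum_{i=0}^{N} u_i^n \phi_i(x) \quad \text{on } \Omega^k, \tag{2.25} $$

where the basis coefficients $u_i^n$ are to be determined at each new time level.
First we take the nonlinear step,

$$ \hat{u} = u^n - \frac{\Delta t}{2} \sum_{q=0}^{2} \beta_q \frac{\partial}{\partial x} (u^{n-q})^2, \tag{2.26} $$

using explicit collocation:

$$ \frac{\partial}{\partial x} (u^n)^2 = \sum_{i=0}^{N} (u_i^n)^2 \phi_i'(x) \quad \text{on } \Omega^k. \tag{2.27} $$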

This expression is evaluated at every nodal point. To compute the Galerkin
approximation to the diffusion step, (2.23b) is first written in the form

$$ \left( \frac{\partial^2}{\partial x^2} - \frac{2}{\nu \Delta t} \right) v = -\frac{1}{\nu \Delta t} (\hat{u} + u^n), \tag{2.28} $$

where $v = \frac{1}{2}(u^{n+1} + u^n)$. This form, called a Helmholtz equation, is simply
(2.1) with an additional term multiplying $v$. The spectral element approximation
of (2.28) results in the algebraic system

$$ \left( A + \frac{2}{\nu \Delta t} M \right) v = \frac{1}{\nu \Delta t} M (\hat{u} + u^n), \tag{2.29} $$

where $A$ and $M$ are the global stiffness and mass matrices defined in (2.19),
and $v$, $\hat{u}$, and $u^n$ are vectors containing the basis coefficients that determine
the approximation $v^h \approx v$, etc. The solution at the new time level is
$u^{n+1} = 2v - u^n$.
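
The two-step scheme can be put together in a compact sketch. Several choices below are assumptions made for illustration rather than details taken from these notes: uniform elements on $[-1, 1]$, the initial condition $u(x, 0) = -\sin \pi x$, a diagonal (GLL-quadrature) mass matrix, first- and second-order Adams-Bashforth steps to start the multistep scheme, dense linear algebra, and a viscosity much larger than the one used in the figures so that a coarse mesh suffices. The nonlinear term is applied in its weak collocation form, which averages the two one-sided derivatives at element interfaces.

```python
import numpy as np
from numpy.polynomial import legendre as L

def gll(N):
    cN = np.zeros(N + 1)
    cN[N] = 1.0
    interior = np.sort(np.real(L.legroots(L.legder(cN))))
    xi = np.concatenate(([-1.0], interior, [1.0]))
    return xi, 2.0 / (N * (N + 1) * L.legval(xi, cN) ** 2)

def diff_matrix(xi):
    N = len(xi) - 1
    cN = np.zeros(N + 1)
    cN[N] = 1.0
    LN = L.legval(xi, cN)
    D = np.zeros((N + 1, N + 1))
    for i in range(N + 1):
        for j in range(N + 1):
            if i != j:
                D[i, j] = LN[i] / (LN[j] * (xi[i] - xi[j]))
        D[i, i] = -D[i].sum()
    return D

def burgers(K, N, nu, dt, nsteps):
    """AB3 / Crank-Nicolson splitting for Burgers' equation on [-1, 1]."""
    xi, rho = gll(N)
    D = diff_matrix(xi)
    size = 2.0 / K
    n = K * N + 1
    idx = [np.arange(k * N, k * N + N + 1) for k in range(K)]
    x, A, m = np.zeros(n), np.zeros((n, n)), np.zeros(n)
    Ak = (2.0 / size) * D.T @ (rho[:, None] * D)
    for k in range(K):
        x[idx[k]] = -1.0 + k * size + 0.5 * size * (1.0 + xi)
        A[np.ix_(idx[k], idx[k])] += Ak
        m[idx[k]] += 0.5 * size * rho              # diagonal global mass matrix

    def advect(u):
        """d(u^2)/dx by collocation, averaged with mass weights at interfaces."""
        r = np.zeros(n)
        for k in range(K):
            r[idx[k]] += 0.5 * size * rho * ((2.0 / size) * D @ u[idx[k]] ** 2)
        return r / m

    H = (A + (2.0 / (nu * dt)) * np.diag(m))[1:-1, 1:-1]   # Helmholtz operator
    ab = {1: [1.0], 2: [1.5, -0.5], 3: [23 / 12, -16 / 12, 5 / 12]}
    u = -np.sin(np.pi * x)                         # assumed initial condition
    hist = []                                      # advection terms at n, n-1, n-2
    for _ in range(nsteps):
        hist = [advect(u)] + hist[:2]
        beta = ab[len(hist)]                       # lower order while starting up
        u_hat = u - 0.5 * dt * sum(b * Nq for b, Nq in zip(beta, hist))
        v = np.zeros(n)                            # v = 0 at both boundaries
        v[1:-1] = np.linalg.solve(H, (m * (u_hat + u) / (nu * dt))[1:-1])
        u = 2.0 * v - u                            # u^{n+1} = 2 v - u^n
    return x, u

x, u = burgers(K=4, N=8, nu=0.1, dt=1e-3, nsteps=200)
print("max |u| at t = 0.2:", np.max(np.abs(u)))
```

In a production code the Helmholtz matrix would be factored once and stored in banded form instead of being solved densely at every step.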


Fig. 2.3. Evolution of a sinusoidal wave governed by the viscous Burgers equation
with $\nu = 10^{-2}/\pi$. The structure of the wave is shown at times from $t = 0$ to
$t = 10/\pi$.


Burgers’ equation can be solved analytically for certain initial conditions.
Figure 2.3 shows how an initial sinusoidal wave evolves into a steep sawtooth
wave at a time near $t = 1/\pi$. The exact solution is given by

$$ u(x, t) = 4 \pi \nu \, \frac{\sum_{n=1}^{\infty} n \, a_n e^{-n^2 \pi^2 t \nu} \sin n \pi x}{a_0 + 2 \sum_{n=1}^{\infty} a_n e^{-n^2 \pi^2 t \nu} \cos n \pi x}, \tag{2.30} $$

where $a_n = (-1)^n I_n(1/(2\pi\nu))$ and $I_n(z)$ is the modified Bessel function of the
first kind [13]. As long as the viscosity $\nu$ is finite the profile is continuous
but varies rapidly within a narrow region around the origin. The value of the
slope at the origin and the time at which it reaches a maximum provide a
measure of both spatial and temporal errors in the approximation.

Figure  2.4  shows  a  sequence  of  mesh  refinements  in  which  the  elements 
near  the  origin  are  halved  in  size  while  the  polynomial  order  is  held  fixed  at 
$N = 10$ (h-refinement). On the coarsest mesh the solution begins to oscillate
as  the  wave  becomes  steeper  but  eventually  recovers  as  the  thin  inner  layer 
diffuses  outward.  Each  mesh  in  Fig.  2.4  contains  the  same  number  of  points — 
the  only  difference  is  the  size  of  the  elements,  and  therefore  the  distribution 
of  points  in  the  domain.  By  clustering  points  near  the  origin,  the  final  mesh 
resolves  the  thin  inner  layer  and  improves  the  solution  without  increasing  the 
computational  cost. 

This final mesh, with $(K, N) = (4, 16)$, gives four significant digits for
both $\max(|\partial u/\partial x|) = 152.06$ and the corresponding time $\pi t = 1.6033$. Even
with a coarser mesh, Fig. 2.5(a) shows that the wave moves at the correct
speed towards the origin. Figure 2.5(b) verifies that the approximation to the
derivative converges exponentially, and in fact the error $e^h = \partial(u - u^h)/\partial x$
is bounded by

$$ \log \| e^h \|_\infty \le \sigma N + \log \| u \|_\infty + \text{const.}, \tag{2.31} $$

where $\sigma \approx -1/4$. The scatter in the convergence data is due in part to the
different approximation properties of odd versus even order polynomials. A
general comparison of convergence properties and approximation errors for
spectral element, finite difference, and global spectral methods applied to the
viscous Burgers equation is given in [12].

We  have  just  observed  two  important  properties  of  spectral  element  ap¬ 
proximations.  First,  high-order  spatial  discretizations  result  in  low  numerical 
dissipation,  i.e.  the  correct  wave  speed  was  maintained  on  each  mesh.  This 
is  an  important  property  for  long-time  integration  of  unsteady  flows  as  dis¬ 
cussed  in  the  Introduction.  Second,  spectral  accuracy  is  achieved  for  rapidly 
varying  solutions  as  long  as  the  solution  is  resolved  adequately  on  the  scale  of 
a  single  element.  These  properties  make  spectral  elements  ideally  suited  for 
solving  the  equations  governing  incompressible  fluid  dynamics,  where  similar 
phenomena  appear  as  boundary  layers  and  shear  layers.  Local  mesh  refine¬ 
ment  was  a  simple  matter  in  this  one-dimensional  example,  but  for  more 
interesting  two-  and  three-dimensional  problems  it  becomes  one  of  the  most 
important  features  of  the  discretization. 

3  Multi-Dimensional  Problems 

3.1  Basis  functions  in  d-dimensions 

A key to the efficiency of high-order methods in two- and three-dimensional
problems is the formation of a basis from the tensor product of one-dimensional
functions. Among other things, this allows the computation of integrals and
derivatives of the basis functions to be simplified through a procedure called
sum factorization [65]. It also contributes to the sparse structure of matrix
systems for multi-dimensional problems.

Fig. 2.5. Numerical integration of the viscous Burgers equation: (a) evolution of
$|\partial u/\partial x|_{x=0}$ for three different meshes and (b) reduction of the error in
$\max(|\partial u/\partial x|)$ with increasing polynomial order $N$.

In this section we describe the procedure for constructing an efficient,
high-order basis on two- and three-dimensional domains. To keep the discussion
simple, we only consider the standard domains $R^d$ and $T^d$, where $d$ is
the problem dimension. Figure 3.1 defines the standard rectangle, $R^2$, and
Fig. 3.2 defines the standard triangle, $T^2$. “Standard” here means that the
coordinates are normalized to fall in the range $-1$ to $1$. For $d = 3$, the standard
domain is a hexahedral or tetrahedral element. Isoparametric mappings
can always be used to transform more general elements to these standard domains,
as illustrated in Fig. 3.1. On the standard element, we wish to define a
polynomial basis, denoted by $\phi_{ij}(\xi_1, \xi_2)$, so that we can represent a function
$u^h(\xi_1, \xi_2)$ by the expansion

$$ u^h(\xi_1, \xi_2) = \sum_{i=0}^{N} \sum_{j=0}^{N} u_{ij} \, \phi_{ij}(\xi_1, \xi_2), $$

where $u_{ij}$ is the coefficient of the basis function $\phi_{ij}$ and $\xi = (\xi_1, \xi_2)$ is the
local coordinate within the element.


Fig. 3.1. Definition of the standard quadrilateral domain $R^2$. General curvilinear
elements can always be mapped back to the standard element as shown.


For quadrilateral (two-dimensional) and hexahedral (three-dimensional) elements, the procedure is straightforward. For example, on the domain $\Omega^k = R^2$, the basis would be

$$\phi_{ij}(\xi_1,\xi_2) = \phi_i(\xi_1)\,\phi_j(\xi_2),$$

where $\phi_i(\xi)$ is the one-dimensional GLL polynomial defined in § 2. In this case, $\hat u_{ij}$ represents the function value at the node $\xi_{ij}$. The three-dimensional basis on $R^3$ is exactly analogous to this one.

Fig. 3.2. Definition of the standard triangular domain $T^2$. Here $r = \xi_1$ and $s = \xi_2$.


To introduce the expansion basis for the standard triangle $T^2$, we first need to define a basic coordinate mapping as illustrated in Fig. 3.3. The rectangular domain $R^2$ can be mapped into the triangular domain $T^2$ by the transformation:

$$\eta_1 = \tfrac{1}{2}(1+\xi_1)(1-\xi_2) - 1, \qquad \eta_2 = \xi_2. \tag{3.1}$$
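The coordinate transformation between the standard square and the standard triangle can be sketched in C++ as follows. This is an illustration only: it assumes the collapsed mapping $\eta_1 = \tfrac12(1+\xi_1)(1-\xi_2) - 1$, $\eta_2 = \xi_2$, and the names `Point2`, `squareToTriangle`, and `triangleToSquare` are hypothetical.

```cpp
#include <cassert>

// A point stored as (first coordinate, second coordinate); hypothetical helper type.
struct Point2 { double a, b; };

// Forward map from the standard square R^2 (xi) to the standard triangle T^2 (eta).
// The whole edge xi2 = 1 collapses onto the vertex eta = (-1, 1).
Point2 squareToTriangle(Point2 xi) {
    return { 0.5 * (1.0 + xi.a) * (1.0 - xi.b) - 1.0, xi.b };
}

// Inverse map from T^2 back to R^2. It is singular at the collapsed vertex
// eta2 = 1, so that point is excluded here.
Point2 triangleToSquare(Point2 eta) {
    assert(eta.b < 1.0);
    return { 2.0 * (1.0 + eta.a) / (1.0 - eta.b) - 1.0, eta.b };
}
```

The singular vertex is one reason quadrature rules that avoid the endpoint $\xi_2 = 1$ are convenient on the triangle.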

The triangular basis is now partitioned into interior modes and boundary modes. Interior modes are zero on the boundary of the triangular domain, similar to the bubble modes used in p-type finite element methods [6,64]. Boundary modes can be further partitioned into vertex and edge modes. Vertex modes vary linearly from zero to one along the edge of the triangle. Edge modes are only non-zero along a single edge of the triangle, and are zero along the other edges and at all vertices.

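Using the notation shown in Fig. 3.3 and recalling that $P_n^{\alpha,\beta}(\xi)$ refers to the Jacobi polynomial, we can write the triangular basis as follows:

- Vertex modes

$$\phi^{\text{vertex A}} = \left(\frac{1-\xi_1}{2}\right)\left(\frac{1-\xi_2}{2}\right),$$

$$\phi^{\text{vertex B}} = \left(\frac{1+\xi_1}{2}\right)\left(\frac{1-\xi_2}{2}\right),$$

$$\phi^{\text{vertex C}} = \frac{1+\xi_2}{2};$$

- Edge modes ($i \ge 2$, $j \ge 1$; $i < L$, $i + j < L$)

$$\phi_i^{\text{edge 1}} = \left(\frac{1+\xi_1}{2}\right)\left(\frac{1-\xi_1}{2}\right)\left(\frac{1-\xi_2}{2}\right)^{i} P_{i-2}^{1,1}(\xi_1),$$

$$\phi_j^{\text{edge 2}} = \left(\frac{1+\xi_1}{2}\right)\left(\frac{1-\xi_2}{2}\right)\left(\frac{1+\xi_2}{2}\right) P_{j-1}^{1,1}(\xi_2),$$

$$\phi_j^{\text{edge 3}} = \left(\frac{1-\xi_1}{2}\right)\left(\frac{1-\xi_2}{2}\right)\left(\frac{1+\xi_2}{2}\right) P_{j-1}^{1,1}(\xi_2);$$

- Interior modes ($i \ge 2$, $j \ge 1$; $i < L$, $i + j < L$)

$$\phi_{ij}^{\text{interior}} = \left(\frac{1+\xi_1}{2}\right)\left(\frac{1-\xi_1}{2}\right) P_{i-2}^{1,1}(\xi_1) \times \left(\frac{1+\xi_2}{2}\right)\left(\frac{1-\xi_2}{2}\right)^{i} P_{j-1}^{2i-1,1}(\xi_2).$$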

The indices $ij$ refer to the principal polynomial in the $\xi_1$ and $\xi_2$ directions. $L$ denotes the total number of modes associated with each direction, i.e. the maximum polynomial order along an edge is $N = L - 1$. For example, if $L = 2$ there are only vertex modes, giving us a linear finite element basis.

This is a polynomial basis in both the $\eta$ and $\xi$ coordinates. In the $\xi$ coordinate system it forms a tensor product, so basic operations such as integration and differentiation can be performed using equivalent one-dimensional operations just like the tensor product basis on $R^d$. It also accommodates exact Gauss-Jacobi quadrature and maintains a partial orthogonality between the modes. This partial orthogonality helps keep the matrices formed from inner products of the basis functions sparse. More details about the two-dimensional basis can be found in [27,75].

The two-dimensional mapping is the foundation for constructing a coordinate system in the tetrahedral domain $T^3$, starting from the coordinate system for the hexahedral domain $R^3$. Figure 3.4 shows how $R^3$ is reduced to $T^3$ by applying the coordinate transformation given in (3.1) to each pair of coordinates. The inverse mapping is

$$\begin{aligned}
\xi_1 &= -2(1+\eta_1)/(\eta_2+\eta_3) - 1, \\
\xi_2 &= 2(1+\eta_2)/(1-\eta_3) - 1, \\
\xi_3 &= \eta_3.
\end{aligned} \tag{3.2}$$

For $\eta_3 = -1$, we recover the two-dimensional mapping. The three-dimensional basis for $T^3$ is then decomposed into vertex modes, edge modes, face modes and interior modes, in analogy with the basis on $T^2$; details can be found in [74].

In the remainder of this chapter we will use the following simplified notation. Every index $(ijk)$ in the tensor product basis will be mapped to a single number as $p = i + jN + kN^2$, so there is a one-to-one correspondence between $\phi_p(\xi)$ and $\phi_{ijk}(\xi)$. This hides the tensor product nature of the basis but makes the discrete equations much easier to write down. When necessary, we can “unroll” the $p$ index to take advantage of the tensor product form. This expression for $p$ is valid for quadrilateral elements only; a modified expression should be used with the triangular domains.
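The flattening and “unrolling” of the index can be sketched in C++ as below. This is an illustration under one stated assumption: the stride `n` is the number of modes per direction, which is what makes the map one-to-one (the text writes this stride as $N$). The names `Index3`, `flatten`, and `unroll` are hypothetical.

```cpp
#include <cassert>

// Tensor-product index (i, j, k); hypothetical helper type.
struct Index3 { int i, j, k; };

// Map (i, j, k) to a single index p = i + j*n + k*n*n,
// where n is the number of modes per direction (assumed stride).
int flatten(Index3 t, int n) {
    assert(0 <= t.i && t.i < n && 0 <= t.j && t.j < n && 0 <= t.k && t.k < n);
    return t.i + t.j * n + t.k * n * n;
}

// Recover (i, j, k) from p, "unrolling" the single index.
Index3 unroll(int p, int n) {
    assert(0 <= p && p < n * n * n);
    return { p % n, (p / n) % n, p / (n * n) };
}
```

With this ordering the first index varies fastest, so a loop over `p` walks through memory contiguously along the first coordinate direction.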

3.2  Data  structures 

Here we describe the data structures and basic operations required to implement the most common procedures in spectral element methods. We cover: representation of the global system, how to transfer global data to local (element) data, direct stiffness summation, and finally the procedures for integration and differentiation of solutions defined on geometrically complex two- and three-dimensional elements.

Fig. 3.3. Schematic of the transformation from $R^2$ to $T^2$.

Fig. 3.4. Schematic of the transformation from $R^3$ to $T^3$.


Implementation  First  we  start  with  the  representation  of  the  solution 
within  a  computer  program.  In  this  section  we  give  several  examples  as 
pseudo-code  fragments  that  follow  basic  C  and  C++  syntax.  This  is  not 
meant  to  be  an  in-depth  presentation,  but  simply  an  illustration  of  the  most 
important  ideas  and  the  basic  approach. 

In  spectral  element  methods,  as  in  finite  element  methods,  global  data  is 
stored  as  a  flat,  unstructured  array.  The  basic  data  structure  used  to  relate  the 
mesh  to  entries  in  this  array  is  a  table  that  identifies  the  global  node  number 
of  a  local  node  within  each  element.  Since  we  are  interested  in  both  nodal 
and  modal  descriptions,  we  replace  “node”  with  the  more  general  concept 
of  a  “degree  of  freedom”  in  the  global  solution.  The  table  of  indices  can  be 
stored  as  a  two-dimensional  array  of  integers: 

    map[k][i] = global index of local datum i
                in element k.


Local  data  can  be  stored  in  any  convenient,  regular  format.  In  our  first 
version,  we  will  assume  the  number  of  degrees  of  freedom  in  the  mesh  (ndof ) 
and  the  number  of  degrees  of  freedom  associated  with  each  element  (edof) 
are  constant.  To  perform  some  global  operation,  for  example  to  evaluate  a 
function  v  =  F(u),  we  insert  a  layer  of  indirection  between  the  unstructured 
global  data  and  the  structured  local  data.  The  following  is  a  template  for  any 
such  computation: 


    for (i = 0; i < ndof; i++)            // Initialize v
        v[i] = 0.;

    for (k = 0; k < nel; k++) {           // Loop over elements
        for (i = 0; i < edof; i++)        // Copy global data
            uk[i] = u[ map[k][i] ];       //   -- gather
        compute(uk, vk);                  // Compute v=F(u) locally
        for (i = 0; i < edof; i++)        // Accumulate the result
            v[ map[k][i] ] += vk[i];      //   -- scatter
    }


Depending on the specific operation, the final result may need to be corrected in some way: rescaled with the global mass matrix, averaged based on the data multiplicity, or some similar global operation. The last loop corresponds to direct stiffness summation, and in our matrix notation we would write this same operation as:

$$v = {\sum_{k=1}^{K}}' \, v^k = {\sum_{k=1}^{K}}' \, F(u^k) = F(u). \tag{3.3}$$

To make this data structure suitable for both hierarchical bases and nonconforming elements (to be developed in § 3.5), we introduce two generalizations. First, we allow the number of degrees of freedom in each element to be different by replacing the constant edof with the array edof[k]. Second, we allow each local degree of freedom to depend on an arbitrary combination of the global degrees of freedom. To implement this we need to introduce two new arrays:

    idof[k][i]    = number of global dependencies for
                    local datum i in element k,
    combine[k][i] = array of coefficients for combining
                    global data to get local data.

And finally, we need to add a new dimension to our index table:

    map[k][i][j] = global index of the jth dependency
                   of local datum i.
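The gather, compute, scatter template can be turned into a small runnable C++ sketch as follows. It is illustrative rather than the authors' implementation: the names `Vec`, `LocalOp`, and `applyGlobal` are assumptions, and the local operation $F$ is passed in as a callable.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<double>;
using LocalOp = std::function<Vec(const Vec&)>;  // local computation vk = F(uk)

// Evaluate v = F(u) element by element: gather local data through the index
// table, apply the local operator, and sum the results back (direct stiffness
// summation). map[k][i] is the global index of local datum i in element k.
Vec applyGlobal(const std::vector<std::vector<int>>& map, const Vec& u,
                const LocalOp& compute) {
    Vec v(u.size(), 0.0);                         // initialize v
    for (const auto& dofs : map) {                // loop over elements
        Vec uk(dofs.size());
        for (std::size_t i = 0; i < dofs.size(); ++i)
            uk[i] = u[dofs[i]];                   // gather
        const Vec vk = compute(uk);               // local computation
        assert(vk.size() == dofs.size());
        for (std::size_t i = 0; i < dofs.size(); ++i)
            v[dofs[i]] += vk[i];                  // scatter (accumulate)
    }
    return v;
}
```

For two one-dimensional linear elements sharing a node, `map = {{0, 1}, {1, 2}}`, the shared global entry receives a contribution from both elements, which is exactly the multiplicity that may need to be corrected afterwards.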

In effect, we are introducing a set of coefficient matrices $Z^k$ that define a general transformation between global and local degrees of freedom. Using this approach, the global initialization, loop over the elements, and function call for the local computation shown above stay the same, but the procedure for constructing the local data is rewritten as follows:

    for (i = 0; i < edof[k]; i++)             // Initialize
        uk[i] = 0.;

    for (i = 0; i < edof[k]; i++) {           // Combine
        real *Z = combine[k][i];
        for (j = 0; j < idof[k][i]; j++)
            uk[i] += Z[j] * u[ map[k][i][j] ];
    }

Likewise, the accumulation of results uses a similar method for combining local contributions to the global degrees of freedom:

    for (i = 0; i < edof[k]; i++) {           // Combine
        real *Z = combine[k][i];
        for (j = 0; j < idof[k][i]; j++)
            v[ map[k][i][j] ] += Z[j] * vk[i];
    }


We also introduce a new matrix notation for this more general approach. Since the local data is $Z^k u$, and the local contribution to the global system is $[Z^k]^T v^k$, the equivalent procedure for assembling the global system is written as:

$$v = {\sum_{k=1}^{K}}' \, [Z^k]^T v^k = {\sum_{k=1}^{K}}' \, [Z^k]^T F(Z^k u) = F(u). \tag{3.4}$$


Compare  this  to  (3.3)  above,  and  note  that  the  only  change  is  how  we  trans¬ 
form  between  the  local  and  global  systems.  The  actual  computations  at  both 
the  local  and  global  level  are  the  same. 
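The generalized transformation can be sketched in runnable C++ as below. This is an illustration under stated assumptions: each local degree of freedom stores one row of $Z^k$ as a list of global indices and weights, and the names `LocalDof`, `Element`, and `applyGlobalZ` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<double>;
using LocalOp = std::function<Vec(const Vec&)>;

// One row of Z^k: the global dependencies of a single local degree of freedom.
struct LocalDof {
    std::vector<int> globalIndex;   // plays the role of map[k][i][j]
    std::vector<double> weight;     // plays the role of combine[k][i][j]
};
using Element = std::vector<LocalDof>;

// Evaluate v = sum_k [Z^k]^T F(Z^k u).
Vec applyGlobalZ(const std::vector<Element>& mesh, const Vec& u,
                 const LocalOp& compute) {
    Vec v(u.size(), 0.0);
    for (const Element& el : mesh) {
        Vec uk(el.size(), 0.0);
        for (std::size_t i = 0; i < el.size(); ++i) {          // uk = Z^k u
            assert(el[i].globalIndex.size() == el[i].weight.size());
            for (std::size_t j = 0; j < el[i].weight.size(); ++j)
                uk[i] += el[i].weight[j] * u[el[i].globalIndex[j]];
        }
        const Vec vk = compute(uk);                             // vk = F(uk)
        for (std::size_t i = 0; i < el.size(); ++i)             // v += [Z^k]^T vk
            for (std::size_t j = 0; j < el[i].weight.size(); ++j)
                v[el[i].globalIndex[j]] += el[i].weight[j] * vk[i];
    }
    return v;
}
```

Using the same weights for the gather and the scatter is what makes the assembled operator the transpose pair $[Z^k]^T F(Z^k\,\cdot)$.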

In the remaining sections we will describe computations in terms of either the local or global system, omitting the actual “assembly” required to go between them. Equation (3.4) is always implied as the method for recovering local solutions and assembling global ones. This simplifies what would otherwise become a confusing barrage of notation. Along the way we will give more specific information about how the coefficients for the mapping matrix $Z^k$ are chosen. This is a very flexible scheme for storing the global solution and reconstructing the local one. The additional storage and computational overhead is simply the price we pay for new capabilities: variable order of the local basis functions and arbitrary connectivity in the mesh. However, these are the key ingredients for adaptive h-p refinement techniques!


Improvements  Although  the  scheme  outlined  above  is  complete,  it  is  not  an 
efficient  way  to  implement  h-p  methods:  too  much  of  the  addressing  is  done 
by  indirection.  One  of  the  computational  advantages  of  high-order  elements  is 
the  natural  partitioning  of  data  into  sets  that  can  be  operated  on  as  a  group. 
For  example,  local  degrees  of  freedom  are  normally  partitioned  into  several 
groups:  vertices,  edges,  faces,  and  interior  data.  Data  associated  with  any  of 
these  groups  can  be  operated  on  as  a  single  entity.  For  example,  all  the  points 
on  the  interior  of  an  element  can  be  identified  with  the  element  number  and 
moved  around  or  computed  on  as  a  single  unit.  High-order  elements  provide 
better  data  locality  than  low-order  elements  because  computations  always 
involve  large  amounts  of  data  that  can  be  grouped  together  in  memory. 

The type of full indirection outlined above is only necessary for the degrees of freedom associated with the surface of an element. These data make up the loosely-coupled components of the global system. This sparse global system forms the “skeleton” of the discretization and shares many characteristics with low-order finite elements. For example, the numbering system stored in the index table can be optimized to reduce its algebraic bandwidth using the same techniques applied in finite element methods (see Sect. 3.6). Unfortunately, more sophisticated data structures than can be described here are required to incorporate these simplifications; we leave this to the reader as an important step in the efficient implementation of spectral element methods.




3.3  Basic  operations 

Integration The general form for the evaluation of an integral by Gaussian quadrature with weights $(1-\xi)^\alpha(1+\xi)^\beta$ can be written as

$$\int_{-1}^{1} (1-\xi)^\alpha (1+\xi)^\beta \, u(\xi)\, d\xi = \sum_{i=0}^{N} \rho_i^{\alpha,\beta}\, u(\xi_i^{\alpha,\beta}),$$

where $\xi_i^{\alpha,\beta}$ and $\rho_i^{\alpha,\beta}$ are the quadrature points and weights associated with the Jacobi polynomial $P_n^{\alpha,\beta}$. The quadrature rule is exact if $u(\xi)$ is a polynomial of degree at most $2N + 1$ for the Gauss points, $2N$ for the Gauss-Radau points, and $2N - 1$ for the Gauss-Lobatto points.

To integrate a function defined over the standard domain $R^2$, we simply use the tensor product form to reduce the integral to two one-dimensional quadratures. The integral of a general function is written as

$$\int_{R^2} u(\xi)\, d\xi_1\, d\xi_2 = \sum_{i=0}^{N} \sum_{j=0}^{N} \rho_i\, \rho_j\, u(\xi_{ij}).$$

The extension to integrals over $R^3$ is straightforward.
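A tensor-product quadrature over the standard square can be sketched in C++ as follows. The sketch assumes a three-point Gauss-Legendre rule ($\alpha = \beta = 0$) with hard-coded points and weights; `Rule1D`, `gaussLegendre3`, and `integrateSquare` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// One-dimensional quadrature rule: points x and weights w.
struct Rule1D { std::vector<double> x, w; };

// Three-point Gauss-Legendre rule on [-1, 1], exact for polynomials of
// degree at most 5.
Rule1D gaussLegendre3() {
    const double a = std::sqrt(3.0 / 5.0);
    return { {-a, 0.0, a}, {5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0} };
}

// Integrate u over the standard square using the tensor product of a 1-D rule:
// sum_i sum_j w_i w_j u(x_i, x_j).
double integrateSquare(const Rule1D& r,
                       const std::function<double(double, double)>& u) {
    assert(r.x.size() == r.w.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < r.x.size(); ++i)
        for (std::size_t j = 0; j < r.x.size(); ++j)
            sum += r.w[i] * r.w[j] * u(r.x[i], r.x[j]);
    return sum;
}
```

For example, integrating $\xi_1^2 \xi_2^2$ with this rule reproduces the exact value $4/9$ up to rounding, since the integrand has degree 2 in each direction.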

On the triangular domain, we use a coordinate transformation to simplify the integral. The integral of a function over $T^2$ becomes

$$\int_{T^2} u(\eta)\, d\eta_1\, d\eta_2 = \int_{R^2} u(\xi)\, |J|\, d\xi_1\, d\xi_2,$$

where $|J| = (1 - \xi_2)/2$ is the Jacobian determinant of the transformation $\eta \to \xi$. The integral in $\xi$ space can now be evaluated just as the integral over $R^2$. To include the Jacobian, we use a quadrature rule with $\alpha = 0$, $\beta = 0$ in the $\xi_1$-direction, and a quadrature rule with $\alpha = 1$, $\beta = 0$ in the $\xi_2$-direction; the remaining constant factor of $1/2$ is applied to the result. Integration over $T^3$ is performed in a similar way.
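The following C++ sketch integrates a function over the standard triangle by way of the collapsed coordinates. Note the substitution: rather than the Gauss-Jacobi rule described in the text, it uses a Gauss-Legendre rule in both directions and multiplies by the Jacobian $(1-\xi_2)/2$ explicitly, which is simpler to write down but needs more points for the same accuracy. The name `integrateTriangle` and the caller-supplied rule are assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<double>;

// Integrate u(eta1, eta2) over the standard triangle with vertices
// (-1,-1), (1,-1), (-1,1). The points x and weights w form a 1-D
// Gauss-Legendre rule on [-1, 1] supplied by the caller.
double integrateTriangle(const Vec& x, const Vec& w,
                         const std::function<double(double, double)>& u) {
    assert(x.size() == w.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        for (std::size_t j = 0; j < x.size(); ++j) {
            const double xi1 = x[i], xi2 = x[j];
            const double eta1 = 0.5 * (1.0 + xi1) * (1.0 - xi2) - 1.0;  // collapsed map
            const double eta2 = xi2;
            const double jac = 0.5 * (1.0 - xi2);                       // |J|
            sum += w[i] * w[j] * jac * u(eta1, eta2);
        }
    }
    return sum;
}
```

With the three-point rule (points $\pm\sqrt{3/5}$ and $0$, weights $5/9$, $8/9$, $5/9$) the area of the triangle, 2, and the first moment of $\eta_1$, $-2/3$, are both reproduced to rounding error.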


Projection To apply the integration rules described above, we need to evaluate a function at a given set of quadrature points. For the nodal basis this is trivial because the basis coefficients are the function values at the quadrature points. For a modal basis we need an efficient way to evaluate the full solution at the quadrature points. This, and the related problem of determining the modal expansion coefficients from a set of nodal values, are both called projections.

A projection is the procedure for determining the coefficients $\hat u_{ijk}$ so that $u^h \approx u$ for some given function $u$. First, recall the general form of the expansion:

$$u(\xi) \approx u^h(\xi) = \sum_p \hat u_p\, \phi_p(\xi).$$

The expansion coefficients are determined by taking the inner-product with the basis functions on both sides of this equation:

$$(u, \phi_p)_{\Omega^k} = (u^h, \phi_p)_{\Omega^k}. \tag{3.5}$$

Solving this system of equations to determine the approximation $u^h$ is straightforward if the basis $\{\phi_{ijk}\}$ is orthogonal. Otherwise, we have to compute $u^h$ by inverting a matrix.

To describe this for the modal basis, we introduce the following notation:

$\hat u_P$ = vector of $P \sim N^3$ expansion coefficients, $\hat u_p \leftarrow \hat u_{ijk}$;

$u_Q$ = vector of $Q$ function values at the quadrature points, $u_q \leftarrow u(\xi_q)$;

$W_{QQ}$ = diagonal matrix of $Q \times Q$ quadrature weights required to integrate a function over $\Omega^k$;

$B_{QP}$ = rectangular matrix containing the value of the basis functions at the quadrature points ($Q$ quadrature points × $P$ basis functions).

Now we can write down the algebraic form of the inner products given in (3.5). First, the inner product of $u$ with the basis functions:

$$(u, \phi_p)_{\Omega^k} \to B^T W u.$$

Second, the inner product of $u^h$ with the basis functions:

$$(u^h, \phi_p)_{\Omega^k} \to B^T W B\, \hat u.$$

The approximation $u^h \approx u$ is determined by matching these two inner products for every basis function:

$$B^T W u = B^T W B\, \hat u. \tag{3.6}$$

This is the fully discrete form of (3.5). Note that the expression on the right-hand side defines the mass matrix $(\phi_i, \phi_j)_{\Omega^k} \to B^T W B$, or simply $M = B^T W B$.

Now we can define the discrete projection operator as

$$\hat u = \mathcal{P}(u) = [B^T W B]^{-1} B^T W u.$$

This is also called the forward transform of a function from physical space (nodal values) to transform space (modal coefficients). The discrete inverse transform is simply the evaluation of the modal basis at a given set of points:

$$u = \mathcal{P}^{-1}(\hat u) = B\, \hat u.$$
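The forward and inverse transforms can be sketched in C++ for small dense systems as follows. This is an illustration only: the mass matrix $B^T W B$ is formed explicitly and solved by Gaussian elimination with partial pivoting, which is a choice made for this example, and the names `solveDense`, `project`, and `evaluate` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Vec = std::vector<double>;
using Matrix = std::vector<Vec>;

// Solve M x = b by Gaussian elimination with partial pivoting (small systems).
Vec solveDense(Matrix M, Vec b) {
    const std::size_t n = b.size();
    for (std::size_t c = 0; c < n; ++c) {
        std::size_t piv = c;
        for (std::size_t r = c + 1; r < n; ++r)
            if (std::fabs(M[r][c]) > std::fabs(M[piv][c])) piv = r;
        assert(std::fabs(M[piv][c]) > 0.0);        // matrix must be nonsingular
        std::swap(M[c], M[piv]);
        std::swap(b[c], b[piv]);
        for (std::size_t r = c + 1; r < n; ++r) {
            const double f = M[r][c] / M[c][c];
            for (std::size_t k = c; k < n; ++k) M[r][k] -= f * M[c][k];
            b[r] -= f * b[c];
        }
    }
    Vec x(n);
    for (std::size_t r = n; r-- > 0;) {            // back-substitution
        double s = b[r];
        for (std::size_t k = r + 1; k < n; ++k) s -= M[r][k] * x[k];
        x[r] = s / M[r][r];
    }
    return x;
}

// Forward transform: uhat = (B^T W B)^{-1} B^T W u, with W = diag(w).
// B[q][p] is basis function p evaluated at quadrature point q.
Vec project(const Matrix& B, const Vec& w, const Vec& u) {
    assert(!B.empty() && B.size() == w.size() && B.size() == u.size());
    const std::size_t Q = B.size(), P = B[0].size();
    Matrix M(P, Vec(P, 0.0));
    Vec rhs(P, 0.0);
    for (std::size_t q = 0; q < Q; ++q)
        for (std::size_t p = 0; p < P; ++p) {
            rhs[p] += B[q][p] * w[q] * u[q];
            for (std::size_t r = 0; r < P; ++r)
                M[p][r] += B[q][p] * w[q] * B[q][r];
        }
    return solveDense(M, rhs);
}

// Inverse transform: u = B uhat.
Vec evaluate(const Matrix& B, const Vec& uhat) {
    Vec u(B.size(), 0.0);
    for (std::size_t q = 0; q < B.size(); ++q)
        for (std::size_t p = 0; p < uhat.size(); ++p)
            u[q] += B[q][p] * uhat[p];
    return u;
}
```

For an orthogonal basis the mass matrix is diagonal and the solve reduces to a division per coefficient, which is the simplification the text refers to.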




Finally we note that in the GLL nodal basis, $M$ is a diagonal matrix. This follows directly from the discrete orthogonality of the basis functions and the fact that $\phi_p(\xi_q) = \delta_{pq}$, where $\xi_q$ are the GLL quadrature points. A diagonal mass matrix is a tremendous simplification since multiplication by $M^{-1}$ is trivial.


Differentiation Since the basis is formed from continuous functions, in principle derivatives can be evaluated by simply differentiating the basis functions:

$$\frac{\partial u^h}{\partial \xi_1} = \sum_{ijk} \hat u_{ijk}\, \frac{\partial \phi_{ijk}}{\partial \xi_1}.$$

In practice we only need the derivatives at certain points, namely the quadrature points. Therefore, the solution is first transformed onto an equivalent Lagrangian interpolant basis defined over the quadrature points. We introduce the one-dimensional Lagrangian derivative matrix

$$D_{ip} = \left.\frac{d\phi_p}{d\xi}\right|_{\xi_i}.$$

Rather than $O(N^3)$ terms, the Lagrangian interpolant basis reduces the summation to an equivalent one-dimensional operation. The coefficient of the derivative, $u'_{ijk}$, is then given by

$$u'_{ijk} = \sum_{p=0}^{N} D_{ip}\, u_{pjk}.$$

Since only $O(N)$ operations are required per point, it takes $O(N^3)$ operations to compute all derivatives in $R^2$ or $T^2$, and $O(N^4)$ operations to compute all derivatives in $R^3$ or $T^3$. In the modal basis, calculation of derivatives is preceded by an inverse transform (to nodal values) and followed by a forward transform (to modal coefficients), therefore increasing the computational cost.
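A C++ sketch of the one-dimensional Lagrangian derivative matrix and its application along the first index of a two-dimensional nodal array is given below. It is illustrative: the matrix is built from the standard formula for derivatives of Lagrange interpolants on arbitrary distinct nodes, and the names `lagrangeDerivativeMatrix` and `differentiateXi1` are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Matrix = std::vector<Vec>;

// D[i][p] = derivative of the Lagrange interpolant phi_p evaluated at node x[i],
// for any set of distinct nodes x.
Matrix lagrangeDerivativeMatrix(const Vec& x) {
    assert(!x.empty());
    const std::size_t n = x.size();
    Vec c(n, 1.0);                         // c[i] = prod_{k != i} (x[i] - x[k])
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k)
            if (k != i) c[i] *= x[i] - x[k];

    Matrix D(n, Vec(n, 0.0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t p = 0; p < n; ++p)
            if (p != i) {
                D[i][p] = c[i] / (c[p] * (x[i] - x[p]));   // off-diagonal entry
                D[i][i] += 1.0 / (x[i] - x[p]);            // diagonal entry
            }
    return D;
}

// du[i][j] = sum_p D[i][p] * u[p][j]: derivative along the first coordinate.
Matrix differentiateXi1(const Matrix& D, const Matrix& u) {
    assert(D.size() == u.size());
    const std::size_t n = u.size(), m = u[0].size();
    Matrix du(n, Vec(m, 0.0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < m; ++j)
            for (std::size_t p = 0; p < n; ++p)
                du[i][j] += D[i][p] * u[p][j];
    return du;
}
```

Each output value costs one row-times-column product of length equal to the number of nodes per direction, which is the one-dimensional operation count quoted above.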


3.4  Spaces  and  norms 

Throughout the rest of these notes we will be concerned primarily with two function spaces, $L^2(\Omega)$ and $H^1(\Omega)$. We define the inner-product of two functions $u$ and $v$ as:

$$(u, v) = \int_\Omega u\, v\, d\Omega. \tag{3.7}$$

For reference we define the $L^2$ norm as:

$$\|u\| = (u, u)^{1/2} \quad \forall u \in L^2(\Omega), \tag{3.8}$$

the $H^1$ norm as:

$$\|u\|_1 = \left[(u, u) + (u_{,i}, u_{,i})\right]^{1/2} \quad \forall u \in H^1(\Omega), \tag{3.9}$$

and the infinity norm as:

$$\|u\|_\infty = \sup_{x \in \Omega} |u(x)| \quad \forall u \in L^\infty(\Omega). \tag{3.10}$$

For the discrete solution, (3.8) and (3.9) can be evaluated approximately by numerical quadrature; the infinity norm can be estimated from the basis coefficients.
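The discrete evaluation of these norms can be sketched in C++ for a single one-dimensional element. The sketch assumes the solution and its derivative are already available at the quadrature points; `normL2`, `normH1`, and `normInfEstimate` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Discrete L2 norm: (u,u) is approximated by sum_q w[q] * u[q]^2.
double normL2(const Vec& w, const Vec& u) {
    assert(w.size() == u.size());
    double s = 0.0;
    for (std::size_t q = 0; q < w.size(); ++q) s += w[q] * u[q] * u[q];
    return std::sqrt(s);
}

// Discrete H1 norm in one dimension: adds the derivative values du,
// sampled at the same quadrature points.
double normH1(const Vec& w, const Vec& u, const Vec& du) {
    assert(w.size() == u.size() && w.size() == du.size());
    double s = 0.0;
    for (std::size_t q = 0; q < w.size(); ++q)
        s += w[q] * (u[q] * u[q] + du[q] * du[q]);
    return std::sqrt(s);
}

// Infinity-norm estimate: the largest sampled magnitude. This is a lower
// bound on the true supremum, since the maximum may fall between points.
double normInfEstimate(const Vec& u) {
    double m = 0.0;
    for (double v : u) m = std::max(m, std::fabs(v));
    return m;
}
```

For $u(\xi) = \xi$ on $[-1, 1]$ with a three-point Gauss rule, the first two functions return $\sqrt{2/3}$ and $\sqrt{8/3}$ up to rounding, while the estimate of the infinity norm is $\sqrt{3/5}$ rather than 1 because the endpoints are not sampled.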


3.5  Global  matrix  operations 

Conforming One of the basic principles for maintaining the sparse structure in the global matrix systems is to enforce only the minimum continuity between elements. For all of the problems we consider here, the global basis is required to be $C^0$ continuous, i.e. only function values and not derivatives are required to be globally continuous. For discretizations with both Lagrangian and h-p basis functions, this is accomplished by choosing a unique set of global “degrees of freedom” that define the approximation space.

Global continuity in the Lagrangian basis is straightforward. Since the basis functions are defined as the Lagrangian interpolant through the elemental nodes, we only have to use the same set of nodes along the edge of adjacent elements. As long as the elements are conforming (each edge matches up exactly to one other edge) and of equal order (same number of nodes along each edge), $C^0$ continuity is guaranteed. Figure 3.5 shows a possible global numbering scheme for a simple quadrilateral mesh.

Continuity in the modal basis is more involved because we have to match up all modes. Depending on the orientation chosen for the triangular elements, local modes may be a positive or negative image of the corresponding global mode. This extra bit of information must be tracked as part of the implementation, and we describe it as one use of the mapping matrix $Z^k$.

Consider a domain made up of two triangular elements as shown in Fig. 3.6. The expansion order is $N = 3$, meaning there are six modes on each triangle: three vertex modes (1, 3, 5) and three edge modes (2, 4, 6), but no interior mode. The number of local degrees of freedom for each element is $n_{\text{edof}} = 6$, and the number of global degrees of freedom for the mesh shown is $n_{\text{dof}} = 9$. The mapping from global to local degrees of freedom for element $\Omega^1$ is:

$$\begin{bmatrix} u_1 \\ \vdots \\ u_6 \end{bmatrix}^{1} = Z^1 \begin{bmatrix} u_1 \\ \vdots \\ u_9 \end{bmatrix},$$

where $Z^1$ is a $6 \times 9$ matrix with a single unit entry in each row, or in short form $u^k = Z^k u$. Notice that data for each local mode maps to one and only one global mode, but data for a global mode can be shared by any number of local modes. The number of local modes that contribute data to one global mode is called the multiplicity of the global mode.
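The multiplicity of each global mode follows directly from the index table, as in this C++ sketch. The function name `multiplicity` is an assumption, and the numbering used in the usage note is a made-up example, not the numbering of Fig. 3.6.

```cpp
#include <cassert>
#include <vector>

// Count how many local modes reference each global mode.
// map[k][i] is the global index of local mode i in element k.
std::vector<int> multiplicity(const std::vector<std::vector<int>>& map, int ndof) {
    assert(ndof >= 0);
    std::vector<int> count(ndof, 0);
    for (const auto& dofs : map)
        for (int g : dofs) {
            assert(0 <= g && g < ndof);
            ++count[g];
        }
    return count;
}
```

For instance, with two six-mode triangles numbered (hypothetically) as `{0,1,2,3,4,5}` and `{2,3,4,6,7,8}`, the three modes on the shared edge have multiplicity 2 and the remaining six have multiplicity 1, giving nine global degrees of freedom in total.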


Fig. 3.5. Local and global numbering for a simple domain composed of two quadrilateral elements of order $N = 2$. Points along the boundary do not constitute global “degrees of freedom” and are not assigned indices in the global index set.


Fig. 3.6. Local and global numbering for a domain containing two triangular elements. Here the expansion order is $N = 3$ so there are $N(N+1)/2 = 6$ modes in each element: three vertex modes (1, 3, 5), and three edge modes (2, 4, 6).


Nonconforming An important extension to the original spectral element method was the introduction of nonconforming elements by Bernardi et al. [14]. Here we give only a sketch of how the method is used to patch together a nonconforming mesh; for a full description of the method, including efficient solution techniques and numerous examples, see the references [2,14,39,40,59].

The main idea is to use a constrained approximation. For a geometrically and functionally nonconforming set of elements, we cannot guarantee global $C^0$ continuity of the basis. Therefore, we make the basis as continuous as possible by minimizing the difference in function values across each nonconforming interface. We do this by enforcing the following weighted residual equation:

$$\int_\Gamma (u - v)\, \psi\, ds = 0 \quad \forall \psi \in P_{N-2}(\Gamma). \tag{3.11}$$

The residual is the difference in two functions $u$ and $v$ that we would like to be continuous, and $\psi$ is the weight used to perform the minimization. The algebraic form of this equation is

$$u = Z\, v,$$

where $u$ and $v$ are the coefficients of whatever basis we choose to represent $u$ and $v$, and the entries of $Z$ are determined by evaluating the residual equation using numerical quadrature. We say the values of $v$ are free and the values of $u$ are constrained to match them such that (3.11) is satisfied.

To use this as a computational tool, we choose $v$ to be the solution along the edge of some element, and $u$ to be the solution along the edge of an adjacent nonconforming element. Equation (3.11) is used to construct $u$ from $v$, thereby eliminating $u$ as an “unknown” in the mesh. Since $v$ contributes to the global degrees of freedom in the problem, this is one type of the “combining” described in § 3.2. There is an additional consistency error associated with the nonconforming discretization because the approximation space is no longer a proper subset of the solution space; it admits discontinuous solutions. As bad as this sounds, the consistency error is of the same order as other components of the approximation error, and if implemented properly the method always converges to a continuous solution if one exists.

Nonconforming  elements  allow  quadrilateral  meshes  to  be  refined  locally, 
without  the  conforming  restriction  propagating  refinement  across  the  mesh.  It 
is  not  as  important  for  triangular  and  tetrahedral  elements  where  algorithms 
such  as  Rivara  refinement  [66]  can  be  used  to  perform  local  refinement  and 
maintain  consistency  in  the  mesh.  We  will  give  several  examples  that  make 
use  of  nonconforming  quadrilateral  elements  in  the  following  sections. 

3.6  Solution  techniques 

In this section we will describe efficient iterative and direct methods for inverting the large algebraic systems that result from nonconforming spectral element discretizations. Iterative methods are more appropriate for steady-state calculations or calculations involving variable properties, such as a changing time step or a Helmholtz equation with a variable coefficient. For direct methods the issue is one of memory management: storing $A$ as efficiently as possible without sacrificing the performance needed for fast back-substitution. The development of fast direct and well-preconditioned iterative solvers represents a major advance towards the application of nonconforming spectral element methods to the simulation of turbulent flows on unstructured meshes.


Conjugate gradient iteration Conjugate gradient methods [11] have been particularly successful with spectral elements because the tensor-product form and local structure allow the global Helmholtz inner product to be evaluated using only elemental matrices. To solve the system $Au = F$ by the method of conjugate gradients we use the following algorithm:

    k = 0; u_0 = 0; r_0 = F;
    while r_k ≠ 0
        Solve M q_k = r_k; k = k + 1
        if k = 1
            p_1 = q_0
        else
            β_k = r_{k-1}^T q_{k-1} / r_{k-2}^T q_{k-2}
            p_k = q_{k-1} + β_k p_{k-1}
        end
        α_k = r_{k-1}^T q_{k-1} / p_k^T A p_k
        r_k = r_{k-1} − α_k A p_k
        u_k = u_{k-1} + α_k p_k
    end
    u = u_k

where $k$ is the iteration number, $r_k$ is the residual, and $p_k$ is the current search direction. The matrix $M$ is a preconditioner used to improve the convergence rate of the method and is discussed in detail next.
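A compact C++ sketch of preconditioned conjugate gradient iteration is shown below. It is illustrative and makes several simplifying assumptions that are not in the text: the matrix is stored densely, the preconditioner is diagonal (passed as the vector `mdiag`), and the loop stops on a residual tolerance `tol` or an iteration cap `maxIter` instead of testing for an exactly zero residual.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Matrix = std::vector<Vec>;

static double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

static Vec matVec(const Matrix& A, const Vec& x) {
    Vec y(A.size(), 0.0);
    for (std::size_t i = 0; i < A.size(); ++i)
        for (std::size_t j = 0; j < x.size(); ++j) y[i] += A[i][j] * x[j];
    return y;
}

// Solve A u = F for symmetric positive definite A, using M = diag(mdiag).
Vec pcg(const Matrix& A, const Vec& F, const Vec& mdiag, double tol, int maxIter) {
    assert(A.size() == F.size() && mdiag.size() == F.size());
    const std::size_t n = F.size();
    Vec u(n, 0.0), r = F, q(n, 0.0), p(n, 0.0);
    double rqOld = 0.0;
    for (int k = 1; k <= maxIter && std::sqrt(dot(r, r)) > tol; ++k) {
        for (std::size_t i = 0; i < n; ++i) q[i] = r[i] / mdiag[i];  // solve M q = r
        const double rq = dot(r, q);
        const double beta = (k == 1) ? 0.0 : rq / rqOld;
        for (std::size_t i = 0; i < n; ++i) p[i] = q[i] + beta * p[i];
        const Vec Ap = matVec(A, p);
        const double alpha = rq / dot(p, Ap);
        for (std::size_t i = 0; i < n; ++i) {
            u[i] += alpha * p[i];      // update the solution
            r[i] -= alpha * Ap[i];     // update the residual
        }
        rqOld = rq;
    }
    return u;
}
```

In a spectral element code the dense `matVec` would be replaced by the element-by-element operator evaluation followed by direct stiffness summation, so the global matrix never needs to be formed.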

Selection of a good preconditioner is critical for rapid convergence; the preconditioner must be spectrally close to the full stiffness matrix yet easy to invert. Popular preconditioners for spectral methods include incomplete Cholesky factorization and low-order (finite element, finite difference) approximations [26,65]. Unfortunately, these preconditioners can be as complicated to construct for an unstructured mesh as the full stiffness matrix $A$. Next we present three preconditioners which are simple to build and apply even when the mesh is unstructured.

In conjugate gradient methods the number of iterations required to reach a given error level scales as $\sqrt{\kappa_A}$, where $\kappa_A$ is the condition number of $A$. This is only an estimate, since the actual convergence rate is determined by the distribution of eigenvalues; if all of $A$’s eigenvalues are clustered together, convergence is much faster. To assess the effectiveness of a given preconditioner we begin by looking at the condition number of $M^{-1}A$.
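For reference, the standard conjugate gradient error bound makes the dependence on the condition number explicit. It is stated here as a general result rather than something derived in these notes; $\kappa$ denotes the condition number of the preconditioned operator $M^{-1}A$, $e_k$ the error after $k$ iterations, and $\|\cdot\|_A$ the energy norm.

```latex
\[
  \|e_k\|_A \;\le\; 2 \left( \frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1} \right)^{k} \|e_0\|_A ,
  \qquad
  k \;\gtrsim\; \tfrac{1}{2} \sqrt{\kappa}\, \ln\!\left(\frac{2}{\epsilon}\right)
  \quad \text{iterations suffice for } \|e_k\|_A \le \epsilon\, \|e_0\|_A .
\]
```

The bound is a worst case over all spectra with the same extreme eigenvalues, which is why clustered eigenvalues can give much faster convergence than it predicts.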

Fig. 3.7. Conjugate gradient iteration convergence history for a Helmholtz equation with $\lambda^2 = 1$: • = none, △ = diagonal, ▽ = row-sum, and □ = block-diagonal preconditioner.


Table 3.1. Condition numbers of $M^{-1}A$ for a Helmholtz equation with $\lambda^2 = 1$.

N  None  Diagonal  Row-Sum  Block-Diagonal


Each of the following methods is based on selecting a subset of entries from the full stiffness matrix. The first two preconditioners are diagonal matrices given by

$$M_{ii} = A_{ii} \qquad \text{“diagonal”}, \tag{3.12}$$

$$M_{ii} = \sum_{j=0}^{n_{\text{dof}}} |A_{ij}| \qquad \text{“row-sum”}, \tag{3.13}$$

where $n_{\text{dof}} = \operatorname{rank}(A)$; the diagonal (3.12) is sometimes called a point Jacobi preconditioner. Both are direct estimates of the spectrum of $A$, and have the advantage of minimal storage and work. They can be quite effective for diagonally dominant systems such as the viscous correction step of the splitting scheme described in § 5. The third preconditioner is a block-diagonal matrix:

$$M_{ij} = \begin{cases} A_{ii} & \text{if } i < n_{\text{bof}},\ j = i, \\ 0 & \text{if } i < n_{\text{bof}},\ j \ne i, \\ A_{ij} & \text{otherwise,} \end{cases}$$

where $n_{\text{bof}}$ is the number of mortar nodes in the mesh. The structure of this matrix assumes that $A$ is arranged in the static condensation format described in Sect. 3.6. Applying this preconditioner amounts to storing and inverting the isolated blocks of $A$ associated with the degrees of freedom on the interior of each element, while applying a simple diagonal matrix to the mortar nodes.
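The two diagonal preconditioners, (3.12) and (3.13), can be sketched in C++ as follows. The sketch assumes a dense, square matrix for clarity; in practice the entries would be accumulated element by element. The names `diagonalPreconditioner`, `rowSumPreconditioner`, and `applyInverse` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Matrix = std::vector<Vec>;

// "Diagonal" (point Jacobi) preconditioner: M_ii = A_ii.
Vec diagonalPreconditioner(const Matrix& A) {
    Vec m(A.size());
    for (std::size_t i = 0; i < A.size(); ++i) {
        assert(A[i].size() == A.size());   // square matrix expected
        m[i] = A[i][i];
    }
    return m;
}

// "Row-sum" preconditioner: M_ii = sum_j |A_ij|.
Vec rowSumPreconditioner(const Matrix& A) {
    Vec m(A.size(), 0.0);
    for (std::size_t i = 0; i < A.size(); ++i)
        for (double a : A[i]) m[i] += std::fabs(a);
    return m;
}

// Apply the inverse of a diagonal preconditioner: q = M^{-1} r.
Vec applyInverse(const Vec& m, const Vec& r) {
    assert(m.size() == r.size());
    Vec q(r.size());
    for (std::size_t i = 0; i < r.size(); ++i) q[i] = r[i] / m[i];
    return q;
}
```

Both preconditioners need only one stored value per degree of freedom, which is the minimal storage and work referred to above.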

The following test examines the iterative solution to a Helmholtz equation for the two extreme cases $\lambda^2 = 1$ and $\lambda^2 = 10\,000$. Convergence is measured with respect to the solution $u(x,y) = \sin \pi x \sin \pi y$. The mesh has $K = 10$ elements generated by recursively subdividing a square domain, with $N = 15$ in each element. Figures 3.7 and 3.8 show the convergence history for the weakly and strongly diagonally dominant systems. The difference in convergence rates is explained in part by the condition numbers of $M^{-1}A$, given in Tab. 3.1 and Tab. 3.2. In spite of yielding a lower condition number, the row-sum preconditioner converges more slowly and therefore offers no particular advantage over the simpler diagonal preconditioner. The block-diagonal matrix performs significantly better than the other two, effectively doubling the convergence rate in both cases. This preconditioner is fully parallelizable, and offers the most promise in distributed computing environments where the cost per iteration can include significant time spent on interprocessor communication; its main drawbacks are the higher operation count and storage requirement. The methods described in the next section for implementing fully direct solvers can also be used to reduce the storage requirement for the block-diagonal preconditioner.
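
To make the two diagonal preconditioners concrete, the following sketch applies (3.12) and (3.13) inside a textbook preconditioned conjugate gradient loop. It is only an illustration under stated assumptions: the test matrix `model_matrix` is a hypothetical one-dimensional stand-in (a tridiagonal stiffness matrix plus $\lambda^2$ times a non-uniform lumped mass), not the spectral element operator used for Figs. 3.7 and 3.8, and all function names are ours.

```python
import math

def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def diagonal_preconditioner(A):
    """Point Jacobi, Eq. (3.12): M_ii = A_ii."""
    return [A[i][i] for i in range(len(A))]

def row_sum_preconditioner(A):
    """Row-sum, Eq. (3.13): M_ii = sum_j |A_ij|."""
    return [sum(abs(a) for a in row) for row in A]

def pcg(A, b, M_diag=None, tol=1e-10, max_iter=1000):
    """Preconditioned CG for a symmetric positive-definite A with a
    diagonal preconditioner. Returns (x, number of iterations)."""
    n = len(b)
    if M_diag is None:
        M_diag = [1.0] * n              # no preconditioning
    x = [0.0] * n
    r = list(b)                          # residual for the zero initial guess
    z = [ri / mi for ri, mi in zip(r, M_diag)]
    p = list(z)
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b))
    for it in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, Ap)]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, it
        z = [ri / mi for ri, mi in zip(r, M_diag)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

def model_matrix(n, lam2):
    """Hypothetical stand-in: 1-D stiffness plus lam2 times a lumped mass
    whose weights vary from node to node."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        weight = 0.1 + i / (n - 1)
        A[i][i] = 2.0 + lam2 * weight
        if i > 0:
            A[i][i - 1] = A[i - 1][i] = -1.0
    return A

n = 30
x_true = [math.sin(math.pi * (i + 1) / (n + 1)) for i in range(n)]
for lam2 in (1.0, 10000.0):
    A = model_matrix(n, lam2)
    b = matvec(A, x_true)
    for name, M in (("none", None),
                    ("diagonal", diagonal_preconditioner(A)),
                    ("row-sum", row_sum_preconditioner(A))):
        x, its = pcg(A, b, M)
        err = max(abs(xi - ti) for xi, ti in zip(x, x_true))
        print(f"lam2={lam2:g} {name:9s} iterations={its:3d} max error={err:.2e}")
```

Running the script prints the iteration count for each choice. The numbers depend on the stand-in matrix and should not be compared with Tables 3.1 and 3.2.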

We conclude this section by giving the memory requirements and computational complexity for a preconditioned conjugate gradient (PCG) solver. Since the elemental Helmholtz operator can be evaluated using only the one-dimensional Lagrangian derivative matrix, the required memory is simply storage for the nodal values and geometric factors:

Fig. 3.8. Conjugate gradient iteration convergence history for a Helmholtz equation with $\lambda^2 = 10\,000$: • = none, △ = diagonal, ▽ = row-sum, and □ = block-diagonal preconditioner.


Table 3.2. Condition numbers of $M^{-1}A$ for a Helmholtz equation with $\lambda^2 = 10\,000$.


N  None  Diagonal  Row-Sum  Block-Diagonal

As mentioned above, the dominant numerical operations are vector-vector and matrix-vector products, although derivative calculations are folded into a more efficient matrix-matrix multiplication. The operation count for the entire solver is

\[ C_I = J_\epsilon \left[ c_1 K N^3 + c_2 K N^2 + c_3 K N \right], \tag{3.16} \]

where $J_\epsilon \propto \sqrt{K N^3}$ is the number of iterations required to reach a given error level $\epsilon$. Our numerical results (Tables 3.1 and 3.2) show that with these preconditioners $J_\epsilon$ is still proportional to $\sqrt{K N^3}$, but the constant is reduced. The block matrix operations required to compute the elemental inner products provide good data locality and can be coded efficiently on both vector processors and RISC microprocessors.


Static condensation The static condensation algorithm is a method for reducing the complexity of the stiffness matrices arising in finite element and spectral element methods. Static condensation is particularly attractive for unstructured spectral element methods because of the natural division of equations into those for boundaries (mortars) and element interiors. To apply this method to the discrete Helmholtz equation, we begin by partitioning the stiffness matrix into boundary and interior points:

\[ \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} u_b \\ u_i \end{bmatrix} = \begin{bmatrix} F_b \\ F_i \end{bmatrix}, \tag{3.17} \]

where $A_{11}$ is the boundary matrix, $A_{12} = [A_{21}]^T$ is the coupling matrix, and $A_{22}$ is the interior matrix. This system can be factored into one for the boundary (mortar) nodes and one for the interior nodes, so that on $\Omega^k$:

\[ [A_{11} - A_{12} A_{22}^{-1} A_{21}]\, u_b = F_b - [A_{12} A_{22}^{-1}]\, F_i, \tag{3.18a} \]

\[ A_{22}\, u_i = F_i - A_{21} u_b. \tag{3.18b} \]

During a pre-processing phase, the global boundary matrix is assembled by summing the elemental matrices,

\[ A_{11} = \sum_{k=1}^{K} [A_{11} - A_{12} A_{22}^{-1} A_{21}], \tag{3.19} \]

and prepared for the solution phase by computing its LU factorization. Equation (3.19) may also be recognized as the Schur complement of $A_{22}$ in $A$. As part of this phase we also compute and store for each element the inverse of the interior matrix $[A_{22}^{-1}]$ and its product with the coupling matrix $[A_{12} A_{22}^{-1}]$.
The  system  is  solved  by  setting  up  the  modified  right-hand  side  of  the  global 
boundary  equations,  solving  the  boundary  equations  using  back-substitution, 
and  then  computing  the  solution  on  the  interior  of  each  element  using  direct 
matrix multiplication. Because the coupling between elements is only $C^0$, the element interiors are independent of each other and on a multiprocessor system this final stage can be solved concurrently.

Fig. 3.9. Static condensation form of the spectral element stiffness matrix. The vector $\phi = u_b$ represents the boundary (mortar) solution, while $u_i$ represents the interior solution.
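
The solution procedure can be sketched for a single small dense system. The example below assumes the standard block partition with the boundary unknowns ordered first, forms the Schur complement $A_{11} - A_{12} A_{22}^{-1} A_{21}$, solves for the boundary values, and then recovers the interior values from the second block row. The matrix, the right-hand side, and all function names are hypothetical; a real spectral element code would do the interior work element by element and factor the assembled boundary matrix only once.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def static_condensation_solve(A, F, nb):
    """Solve A u = F, treating the first nb unknowns as boundary nodes."""
    ni = len(A) - nb
    A11 = [row[:nb] for row in A[:nb]]
    A12 = [row[nb:] for row in A[:nb]]
    A21 = [row[:nb] for row in A[nb:]]
    A22 = [row[nb:] for row in A[nb:]]
    Fb, Fi = F[:nb], F[nb:]

    # Pre-processing: columns of W = A22^{-1} A21 and the Schur complement.
    cols = [solve(A22, [A21[i][j] for i in range(ni)]) for j in range(nb)]
    W = [[cols[j][i] for j in range(nb)] for i in range(ni)]
    A12W_cols = [matvec(A12, cols[j]) for j in range(nb)]
    S = [[A11[i][j] - A12W_cols[j][i] for j in range(nb)] for i in range(nb)]

    # Solution phase: modified right-hand side, boundary solve, interior update.
    y = solve(A22, Fi)                       # A22^{-1} F_i
    A12y = matvec(A12, y)
    g = [Fb[i] - A12y[i] for i in range(nb)]
    ub = solve(S, g)
    Wub = matvec(W, ub)
    ui = [y[i] - Wub[i] for i in range(ni)]  # A22^{-1} (F_i - A21 u_b)
    return ub + ui

# Hypothetical symmetric, diagonally dominant test system with 2 boundary nodes.
A = [[4.0 if i == j else -1.0 if abs(i - j) == 1 else -0.5 if abs(i - j) == 2 else 0.0
      for j in range(5)] for i in range(5)]
F = [1.0, 2.0, 3.0, 4.0, 5.0]
u_direct = solve(A, F)
u_condensed = static_condensation_solve(A, F, nb=2)
print(max(abs(a - b) for a, b in zip(u_direct, u_condensed)))
```

The printed value is the largest difference between the condensed solution and a direct solve of the full system, which should be at the level of rounding error.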

Figure 3.9 illustrates the structure of a typical spectral element stiffness matrix factored using this approach. To reduce computational time and memory requirements for the boundary phase of the direct solver, we wish to find an optimal form of the discrete system corresponding to a minimum bandwidth for the matrix $A_{11}$. This is complicated by the irregular connectivity generated by the use of nonconforming elements. One approach to bandwidth optimization is to think of the problem in terms of finding an optimal path through the mesh that visits “nearest neighbors.” During each of the $K$ stages of the optimization, an estimate is made of the new bandwidth that results from adding one of the unnumbered elements to the current path. The element corresponding to the smallest increase is chosen for numbering, resulting in what is essentially a greedy algorithm. This basic concept is illustrated in Fig. 3.10. The reduction in bandwidth translates to direct savings in memory and quadratic savings in computational cost. Note that standard methods of bandwidth reduction used for finite elements, e.g. the Reverse Cuthill-McKee algorithm, can also be used, although they need only be applied to the boundary system.
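
As an illustration of the nearest-neighbor idea, the sketch below numbers the boundary nodes of a mesh one element at a time. It is a simplified greedy heuristic written for this text, not the algorithm behind Table 3.3: the cost of a candidate is the index spread of its own nodes after tentative numbering, only elements touching the current path are considered, and bandwidth is measured as the largest index difference between two nodes of the same element. The strip mesh and node labels are made up.

```python
def element_bandwidth(nodes, number):
    """Largest index difference among the nodes of one element."""
    ids = [number[n] for n in nodes]
    return max(ids) - min(ids)

def bandwidth(elements, number):
    """Bandwidth of the boundary system implied by a node numbering."""
    return max(element_bandwidth(nodes, number) for nodes in elements)

def greedy_numbering(elements, start=0):
    """Number boundary nodes by growing a path of elements greedily.
    elements[e] is the list of boundary-node labels of element e."""
    number = {}                       # node label -> global index
    order = []                        # elements in the order they were added
    remaining = set(range(len(elements)))

    def assign(e, table):
        for node in elements[e]:
            table.setdefault(node, len(table))   # next free index if new

    def cost(e):
        trial = dict(number)
        assign(e, trial)
        return element_bandwidth(elements[e], trial)

    def take(e):
        assign(e, number)
        order.append(e)
        remaining.discard(e)

    take(start)
    while remaining:
        # Prefer elements that share a node with the path ("nearest neighbors").
        neighbors = [e for e in sorted(remaining)
                     if any(node in number for node in elements[e])]
        candidates = neighbors or sorted(remaining)
        take(min(candidates, key=cost))           # smallest increase wins
    return number, order

def strip_mesh(K):
    """Hypothetical strip of K quadrilaterals sharing vertical edges."""
    return [[f"b{k}", f"b{k+1}", f"t{k+1}", f"t{k}"] for k in range(K)]

elements = strip_mesh(6)
# A poor numbering for comparison: all bottom nodes first, then all top nodes.
naive = {f"b{k}": k for k in range(7)}
naive.update({f"t{k}": 7 + k for k in range(7)})
number, order = greedy_numbering(elements)
print("naive bandwidth :", bandwidth(elements, naive))
print("greedy bandwidth:", bandwidth(elements, number))
print("element order   :", order)
```

For a library-quality alternative, the Reverse Cuthill-McKee ordering mentioned above is available in standard sparse matrix packages.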

The search for an optimal numbering system can be accomplished during preprocessing, so the extra work has no impact on the simulation cost and can result in significant savings. Table 3.3 shows the results of bandwidth optimization for each of the computational domains pictured in Fig. 3.11. For computers where memory is a limitation, this procedure can determine whether an in-core solution is even possible.

Fig. 3.10. Bandwidth optimization for a spectral element mesh: (a) computational domain, (b) connectivity graph and (c) an optimal path for numbering the boundary nodes in the mesh. Line thickness demonstrates the change in global bandwidth with each step.


Table 3.3. Matrix rank and optimized bandwidth of three complex-geometry domains representative of internal and external flow problems.

Mesh            K     N    rank   original   optimized   savings
riblets         91    9    1484   1483       250         83%
cylinder        114   11   2416   2406       402         83%
half-cylinder   176   7    2177   2156       399         81%


Fig. 3.11. Nonconforming meshes used to test the bandwidth optimization.


Other simple memory optimizations include storage of only a single copy of the interior and coupling matrices for each element with the same geometry, and evaluation of the force vector $F$ using tensor product summation instead of matrix operations. By carefully organizing matrix usage, the overall memory requirement scales as

\[ S_D = s_1 K^{3/2} N^2 + s_2 K N^3 + s_3 K N^4. \tag{3.20} \]

As mentioned in the introduction to this section, the direct solver is advantageous only when the cost of factoring the stiffness matrix can be spread over a large number of solutions. Therefore, we consider only the cost of a back-substitution using the factored stiffness matrix, for which the operation count scales as

\[ C_D = c_1 K^{3/2} N^2 + c_2 K N^4 + c_3 K N. \tag{3.21} \]

For a well-conditioned, diagonally dominant system this method usually results in at least a factor of two savings versus an iterative solver. For a system that is not diagonally dominant, like the Navier-Stokes pressure equation, it can be faster by a full order of magnitude.

3.7  Examples 

Advection As a model for the nonlinear term in the Navier-Stokes equations, we now look at a linear advection equation. It can be written as

\[ \frac{\partial u}{\partial t} - \mathbf{a} \cdot \nabla u = 0 \quad \text{on } \Omega, \tag{3.22} \]

where $u$ is a scalar and $\mathbf{a}$ is a given velocity vector field defined on $\Omega$. For simplicity we assume $\mathbf{a}$ is constant, divergence free, and normalized so that $|\mathbf{a}| = 1$ pointwise. To complete the statement of the problem we must also supply boundary and initial conditions for $u$, but we leave these open for now. Equation (3.22) represents the transport of $u$ by the velocity field $\mathbf{a}$, and it plays an important role in many areas of physics. Here we will be concerned primarily with developing stable time integration schemes to go along with high-order spatial discretizations.

The weak form of (3.22) is: Find $u^h \in S^h$ such that for all $w^h \in V^h$

\[ \int_\Omega w^h \left( \dot{u}^h - \mathbf{a} \cdot \nabla u^h \right) d\Omega = 0, \tag{3.23} \]

where $\dot{u}^h = \partial u^h / \partial t$. The discrete form of the elemental system is

\[ M^k \dot{u}^k - D^k u^k = 0, \tag{3.24} \]

where the elemental mass and advection matrices are

\[ M^k_{pq} = (\phi_p, \phi_q)_{\Omega^k}, \qquad D^k_{pq} = (\mathbf{a} \cdot \nabla \phi_p, \phi_q)_{\Omega^k}. \tag{3.25} \]

We interpret the vector $u^k$ as containing either the nodal values of the solution or the expansion coefficients of the modal basis functions.

Although external boundary conditions are part of the physical statement of the problem, to form the global system and complete the discretization we have to choose “internal” boundary conditions for the subdomain interfaces. One possibility is to use the method of characteristics, which reduces to simple upwinding for the scalar equation. Alternatively, $C^0$ continuity can be imposed by forming a weighted average of the flux $\mathbf{a} \cdot \nabla u$ at element boundaries, using the mass matrix to provide the weights. This procedure is also stable for smooth solutions, and numerical experiments indicate that for well-resolved problems there is little difference between the accuracy or stability of the two methods. The averaging method is much easier to program since it corresponds to the “direct stiffness summation” described earlier; in this case the global system matrices are formed as

\[ M = \sideset{}{'}\sum_{k=1}^{K} M^k, \qquad D = \sideset{}{'}\sum_{k=1}^{K} D^k, \tag{3.26} \]

and the solution satisfies $\dot{u} = M^{-1} D u$.

Since the GLL nodal basis functions are discretely orthogonal, the associated mass matrix is diagonal and the inversion is trivial. The modal basis is only semi-orthogonal and the corresponding mass matrix is sparse but not diagonal. Since the modal mass matrix is symmetric and positive-definite, we can invert it with the same iterative methods, such as preconditioned conjugate gradient iteration, that work well with the discrete Laplacian [21,24].

To propagate the solution $u$ we discretize time and apply a numerical time integration scheme with some step size $\Delta t$. The central question is whether the method and time step we choose result in a stable scheme. For nonlinear equations like Navier-Stokes, explicit methods are generally used for the convective terms and the stability is determined by a CFL-type condition of the form

\[ |\mathbf{a}| \frac{\Delta t}{\Delta x} < \text{const.} \tag{3.27} \]

However, there is no direct analog of the CFL condition for high-order methods, so we have to make a heuristic estimate for the value of $\Delta t$ that will keep the scheme stable. To do this we need to determine the growth rate of the eigenvalues of the discrete system.

Eigenvalues of the linear advection operator are determined by the nontrivial solutions $(\lambda, u)$ of

\[ (\mathbf{a} \cdot \nabla - \lambda) u = 0. \tag{3.28} \]

Eigenvalues of the discrete problem are determined by the system

\[ (G - \lambda I) u = 0, \tag{3.29} \]

where $G = M^{-1} D$. This yields the spectrum associated with the spatial discretization, and for stability the eigenvalues of the related matrix $(I + \Delta t\, G)$ must lie within the stability region of the time stepping scheme. To state this another way, the time step $\Delta t$ must balance the largest eigenvalues of $G$.

First we consider the modal basis on triangular elements, using a periodic domain discretized as shown in Fig. 3.12. We can determine the maximum eigenvalue for wavevectors $\mathbf{a} = (\cos\theta, \sin\theta)$ corresponding to various directions of propagation across the domain. The worst case ($\theta = \pi/4$) corresponds to a wave propagating through the tip of the triangle where the mesh spacing is the smallest. Figure 3.13 shows the maximum eigenvalue versus expansion order $N$, indicating that $\max(|\lambda|) \sim O(N^2)$. The same result applies to quadrilateral elements using the nodal basis. Figure 3.14 shows the maximum eigenvalues for a simple rectangular domain, again demonstrating $O(N^2)$ growth.

From this, we can form the following heuristic stability criterion:

\[ \Delta t < \frac{\text{const.}}{|\mathbf{a}| N^2}, \tag{3.30} \]

where the constant depends on the particular time stepping method and the uniformity of the mesh. Generally, this criterion should be checked on each element in the mesh and the smallest stable value of $\Delta t$ chosen for the integration, possibly adapting with each time step. Although the examples we showed were for two-dimensional problems, the same criterion applies to one- and three-dimensional problems as well.

Fig. 3.12. For the periodic domain shown on the left we consider a wave propagating with velocity $\mathbf{a} = (\cos\theta, \sin\theta)$. The polar plot on the right shows the maximum eigenvalue of the discrete advection operator for several wave orientations and different numbers of modes $M = N$.


Fig. 3.13. Growth rate of the maximum eigenvalue with respect to polynomial order $N$ for the modal basis [75].


Fig. 3.14. Growth rate of the maximum eigenvalue of the discrete advection operator $G$ on conforming and nonconforming meshes versus polynomial order $N$ for the nodal basis.


The explanation for the stability limit given by (3.30) is that the polynomial basis clusters the mesh points near the ends of the element, so that near the element boundaries $\Delta x \sim N^{-2}$. This estimate is standard in polynomial spectral methods [34]. On an equispaced grid that might be used with a Fourier spectral method or a finite difference discretization, the mesh spacing is $\Delta x \sim N^{-1}$ and this limit is less strict:

\[ \Delta t < \frac{\text{const.}}{|\mathbf{a}| N}. \tag{3.31} \]

In order to weaken the limit on $\Delta t$ in (3.30) for polynomial spectral methods, we can redistribute the collocation points to achieve a more even distribution. Although an arbitrary mapping may lead to unstable approximations, stable transformations have been developed that can result in a CFL limit for spectral methods quite close to the finite difference method on uniform grids [53]. In practice, a typical polynomial order is $N < 20$ and the difference between (3.30) and (3.31) is not a serious disadvantage to the more straightforward approach.
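
The clustering argument is easy to check numerically. The sketch below uses the Chebyshev Gauss-Lobatto points $x_j = -\cos(\pi j / N)$ as a stand-in because they have a closed form; this is an assumption of the example, since the nodal basis in the text uses Gauss-Lobatto-Legendre points, which cluster near the element ends in the same $O(N^{-2})$ fashion. It compares the smallest spacing with the uniform spacing $2/N$ on $[-1, 1]$.

```python
import math

def cgl_points(N):
    """Chebyshev Gauss-Lobatto points on [-1, 1], in increasing order."""
    return [-math.cos(math.pi * j / N) for j in range(N + 1)]

def min_spacing(x):
    """Smallest gap between neighboring points of a sorted list."""
    return min(b - a for a, b in zip(x, x[1:]))

print(" N   dx_min(poly)  dx(uniform)  dx_min*N^2")
for N in (4, 8, 16, 32):
    dx_poly = min_spacing(cgl_points(N))
    dx_unif = 2.0 / N
    print(f"{N:2d}   {dx_poly:.6f}      {dx_unif:.6f}     {dx_poly * N**2:.4f}")
```

For these points the product $\Delta x_{\min} N^2$ tends to $\pi^2 / 2$ as $N$ grows, while the uniform spacing shrinks only like $N^{-1}$, which is the difference between (3.30) and (3.31).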


Diffusion The diffusion of a scalar $u$ with diffusivity $\nu$ is described by the equation

\[ \frac{\partial u}{\partial t} - \nu \nabla^2 u = 0. \tag{3.32} \]

It represents a type of “averaging” of $u$ that might describe the spreading of heat, momentum, or vorticity in a fluid. It is an important equation that shows up in many branches of physics, but we will put it aside for the moment in favor of another model problem for the approximation of elliptic equations; at the end of this section we show how the two are related.

The Helmholtz problem is: given $k \in \mathbb{R}$ and smooth functions $f : \Omega \to \mathbb{R}$, $g : \Gamma_g \to \mathbb{R}$, and $h : \Gamma_h \to \mathbb{R}$, find $u$ such that

\[ \nabla^2 u - k^2 u + f = 0 \quad \text{in } \Omega, \tag{3.33} \]

subject to the boundary conditions

\[ u = g \quad \text{on } \Gamma_g, \tag{3.34} \]

\[ \mathbf{n} \cdot \nabla u = h \quad \text{on } \Gamma_h. \tag{3.35} \]

There are some special cases of equation (3.33): if $k$ is zero it is called Poisson’s equation, and if $k$ and $f$ are both zero it is called Laplace’s equation.

The Galerkin approximation to (3.33) is developed in much the same way as already shown for the one-dimensional problem in Section 2. We only need to extend the ideas to two and three dimensions. The variational form of our boundary value problem is: Find $u \in S$ such that for all $w \in V$:

\[ a(u, w) = (f, w) + (h, w)_{\Gamma_h}, \tag{3.36} \]

where the symmetric, bilinear form $a(\cdot, \cdot)$ is defined as

\[ a(u, w) = \int_\Omega \left( \nabla u \cdot \nabla w + k^2 u w \right) d\Omega. \tag{3.37} \]

Let $S^h \subset S$ be the space of $C^0$ piecewise polynomial interpolants of degree $N$ that satisfy the essential boundary condition, and $V^h \subset V$ a similar space of functions that have value zero on $\Gamma_g$; these are our basis functions $\phi_i(x)$. To complete the Galerkin approximation to (3.36), we separate the solution into $u^h = g^h + v^h$, where $g^h \in S^h$ is a polynomial approximation to $g$ and $v^h \in V^h$ is the unknown part of $u^h$. Usually, $g^h$ will be an initial guess for $u^h$ that satisfies the boundary conditions but not the weak form, so $v^h$ is simply a correction to make it exact. Evaluation of the integral form (3.36) by numerical quadrature gives the elemental matrices:

\[ A^k_{pq} = a(\phi_p, \phi_q)_{\Omega^k}, \tag{3.38} \]

\[ F^k_p = (f, \phi_p)_{\Omega^k} + (h, \phi_p)_{\Gamma_h} - a(g^h, \phi_p)_{\Omega^k}, \tag{3.39} \]

and the discrete Galerkin equation for the $k$th element as

\[ A^k v^k = F^k. \tag{3.40} \]

To form the global system there is only one choice for the “internal” boundary conditions, and that is to apply direct stiffness summation to get

\[ A = \sideset{}{'}\sum_{k=1}^{K} A^k, \qquad F = \sideset{}{'}\sum_{k=1}^{K} F^k. \tag{3.41} \]

The final algebraic system,

\[ A_{pq} v_q = F_p, \qquad p, q = 1, \ldots, n_{\mathrm{dof}}, \tag{3.42} \]

requires the inversion of a symmetric, positive-definite matrix $A$ whose bandwidth is determined by the index set we use to map between the local and global systems.

Now we return to the problem of integrating the diffusion equation. We could follow the same approach used for the advection equation, writing the semi-discrete form as

\[ M \dot{u} = \nu A u. \tag{3.43} \]

However, the discrete Laplace matrix $A \sim D^2$ and for stability the time step would scale like $\Delta t \sim N^{-4}$; this has been demonstrated more rigorously for both the nodal and the modal basis in [40, 75]. For this reason, the diffusion equation is usually integrated with implicit rather than explicit methods. For example, we can approximate the time derivative with an unconditionally stable backward Euler approximation:

\[ \frac{u^{n+1} - u^n}{\Delta t} = \nu \nabla^2 u^{n+1}. \tag{3.44} \]

Rearranging this, we get

\[ \left( \nabla^2 - \frac{1}{\nu \Delta t} \right) u^{n+1} = -\frac{1}{\nu \Delta t}\, u^n, \tag{3.45} \]

which is immediately recognized as the Helmholtz equation with $k = 1/\sqrt{\nu \Delta t}$ and $f = u^n / \nu \Delta t$. After developing appropriate methods for equation (3.33), we can solve any implicit approximation to the diffusion equation.
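
The reduction of a backward Euler step to a Helmholtz solve can be sketched in one dimension. The example assumes a second-order finite difference Laplacian on a uniform grid with homogeneous Dirichlet conditions, used here only as a simple stand-in for the spectral element operator, and solves the resulting tridiagonal system with the Thomas algorithm. The function names and parameter values are hypothetical.

```python
import math

def thomas(a, b, c, d):
    """Solve a tridiagonal system with sub-diagonal a, diagonal b,
    super-diagonal c and right-hand side d (a[0], c[-1] are unused)."""
    n = len(d)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def helmholtz_solve_1d(f, k2, h):
    """Solve u'' - k2*u + f = 0 at interior grid points, u = 0 at both ends."""
    n = len(f)
    off = 1.0 / h**2
    return thomas([off] * n, [-2.0 * off - k2] * n, [off] * n,
                  [-fi for fi in f])

def backward_euler_step(u, nu, dt, h):
    """One step of u_t = nu*u_xx written as a Helmholtz problem
    with k^2 = 1/(nu*dt) and f = u^n/(nu*dt)."""
    k2 = 1.0 / (nu * dt)
    f = [k2 * ui for ui in u]
    return helmholtz_solve_1d(f, k2, h)

n, nu, dt = 49, 1.0, 0.1
h = 1.0 / (n + 1)
x = [(j + 1) * h for j in range(n)]
u = [math.sin(math.pi * xj) for xj in x]
for step in range(5):
    u = backward_euler_step(u, nu, dt, h)
    print(step + 1, max(abs(ui) for ui in u))
```

The time step used here is several hundred times larger than the explicit limit $h^2 / (2\nu)$ of the same finite difference grid, yet the solution decays monotonically.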
4 Adaptive Mesh Refinement

Thus far we have looked at the development of high-order methods that incorporate the essential features needed to adaptively refine the discrete model of a flow during a simulation. We refer to this as dynamic refinement. In spectral or h-p finite element methods refinement takes place by decreasing the size of the grid elements (h-refinement) or increasing the order of the solution (p-refinement). As simple as this sounds, the algorithms for driving adaptive refinement at a “high level” are quite complicated. Adaptive mesh refinement is often as much of an art as a science: it depends on the experienced selection of tolerances and refinement criteria that are highly problem-dependent. The implementation of adaptive methods is equally complex and usually involves the development of irregular, dynamically changing data structures that reflect the complexity of the discrete models.

The basis functions that we have looked at so far have sufficient flexibility to support the necessary flavors of mesh refinement. High-order expansions on triangular and tetrahedral elements [75] are probably the most straightforward because refinement can be implemented without any fundamental changes in the topology of the mesh. Quadrilateral and hexahedral grids do require a different topology to be efficient, namely nonconforming element boundaries between regions with different spatial resolution. Several choices for high-order expansions on quadrilateral grids have appeared in the recent literature, including Chebyshev polynomials combined with a multipole expansion for the Poisson problem [35], high-order B-spline expansions on locally refined grids [72], and staggered-grid Chebyshev spectral collocation methods for simulating compressible fluid flows [51,52]. These methods share a common thread in that the grids used to discretize space look similar, but they differ in both the way an approximate solution is represented and how nonconforming elements of the mesh are pieced together. All of these techniques may be classified as spectral element methods because of the general combination of domain decomposition and high-order polynomial expansions.

In this section we look at the implementation of a high-order adaptive code based on the nonconforming spectral element method developed in Sect. 3. In practice this method is used with high-order polynomials ($p \approx 4$ to $16$) and a mesh of elements that is generated adaptively by h-refinement. We will not attempt to refine both the elements and the basis functions simultaneously, as the author’s experience indicates that uniformly high $p$ and adaptive mesh refinement lead to an efficient solution for a wide variety of problems.

The formulation based on mortar elements [14] allows completely arbitrary assembly of nonconforming elements. However, our goal is to develop automatic procedures for generating an appropriate mesh and this calls for some compromises. To simplify the encoding of the mesh we will require the refinement to propagate down a quadtree (two-dimensional geometries) or octree (three-dimensional geometries). A basic description of the mesh generation procedure is provided in Sect. 4.2. The tree structure is found to be a suitable restriction for problems with smooth solutions and leads to a significant reduction in the complexity of the data structure needed to represent the many levels in the refined grid. For complex geometries the mesh may incorporate multiple trees at the coarse level.

To give a more specific introduction to the goals of developing an adaptive spectral element method, Fig. 4.15 shows a sample calculation for the impulsively started flow past a bluff plate. In this simulation the solution field is generated by integrating the incompressible Navier-Stokes equations from an initial state of zero motion. The characteristic scales in the problem are the free-stream speed $u_\infty$, the plate diameter $d$, and the kinematic viscosity of the fluid $\nu$. The Reynolds number, defined as $Re = u_\infty d / \nu$, is set to the value $Re = 1000$. The lower part of the figure shows the global domain used to represent the flow around the plate. A symmetry condition is imposed along the centerline so that only one half of the flow field needs to be computed. The upper part of the figure is an enlargement of the near wake region. It shows both the vorticity of the developing flow at an early time and the adaptively generated mesh. Each element is an $8 \times 8$ point subdomain ($p = 7$) of the global solution. A large number of separate ‘trees’ are needed at the coarse level to correctly model the beveled geometry of the finite-thickness plate. The initial stage of mesh generation is done by hand to provide the correct starting geometry. Once the problem is handed to the flow solver the additional adaptivity in the mesh is based on a maximum allowable approximation error in the vorticity field.

Because the algorithms for time integration in problems like the one illustrated in Fig. 4.15 are generally semi-implicit, the computational issues that arise are somewhat different when compared to other methods that incorporate adaptive meshes. We are interested primarily in studying incompressible flows governed by the Navier-Stokes and Euler equations. Because of the elliptic nature of the governing equations (due in part to the incompressibility constraint), local time-stepping is not usually an option. Therefore, solving the elliptic boundary-value problems that arise in these systems is a particular challenge. Even for two-dimensional flows the resolution needed to maintain sufficiently high accuracy can lead to very large systems of equations, and computational efficiency is an important issue. In the past this meant algorithms that could be vectorized, while today it means algorithms that can be parallelized. There is a close relationship between spectral elements and finite elements, so when it comes to parallel computing many of the same problems (e.g. load balancing) arise, and similar solutions apply. Section 4.4 addresses the implementation of this method for parallel computers with a programming model based on a weakly coherent shared memory which is synchronized via message passing.

Fig. 4.15. Simulation of the impulsively started flow past a bluff plate at $Re = 1000$ using an adaptive spectral element method: (top) close-up of the mesh and vorticity of the flow a short time after the impulsive start; (bottom) global computational domain.


Just as important as overall computational performance are the algorithms used for driving adaptive refinement. Ideally such an algorithm would take as input an error estimate and produce as output a new discrete model or mesh that reduces the error. The basic problems are the lack of an error estimate for nonlinear systems and the unlimited ways in which such an algorithm could improve the discrete model. The latter problem is addressed by restricting ‘improvements’ to propagating refinement down the tree as described in Sect. 4.2. The former problem is addressed with a pseudo-heuristic error estimate based on the local polynomial spectrum as described in Sect. 4.3. Depending on the nonlinearity in the partial differential equations being solved, parts of the spectrum will give an accurate approximation to the true solution and parts will be polluted. We estimate the order of magnitude of the local error by examining the decay along the tail of the local polynomial spectrum. In a general sense, this heuristic flags locations in the mesh where the polynomial basis fails to provide a good description of the solution. For simple problems (linear, one-dimensional) this can be formally related to the true difference between the exact solution and the approximate solution, i.e. the approximation error. For more interesting problems it is shown to be a robust guide for driving adaptivity. The heuristic is easy to compute but is only accurate as an error estimate in computations with sufficiently high $p$, meaning that the local polynomial coefficients should decay like $|a_n| \sim \exp(-\alpha n)$ for $p = n \gg 1$. This is generally not true near singular points (e.g. corners) and these locations are automatically flagged for refinement. The method based on local spectra is compared to simpler heuristics such as refining in regions with strong gradients and the two are shown to lead to quite different results. In general the local spectrum works well and is a good match to the overall computational strategy.
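
One simple way to examine the tail of a local spectrum is a least-squares fit of $\log |a_n|$ against $n$ over the last few coefficients. The sketch below is our own illustration of that idea and is not the estimate developed in Sect. 4.3: the number of tail coefficients `m`, the geometric-series extrapolation of the truncated terms, the tolerance, and the function names are all assumptions.

```python
import math

def tail_decay(coeffs, m=4):
    """Fit |a_n| ~ C*exp(-alpha*n) to the last m coefficients.
    Returns (alpha, C)."""
    p = len(coeffs) - 1
    ns = list(range(p - m + 1, p + 1))
    # Guard against log(0) for coefficients that vanish exactly.
    ys = [math.log(max(abs(coeffs[n]), 1e-300)) for n in ns]
    n_bar = sum(ns) / m
    y_bar = sum(ys) / m
    slope = (sum((n - n_bar) * (y - y_bar) for n, y in zip(ns, ys))
             / sum((n - n_bar) ** 2 for n in ns))
    alpha = -slope
    C = math.exp(y_bar + alpha * n_bar)
    return alpha, C

def tail_estimate(coeffs, m=4):
    """Extrapolated size of the truncated terms n > p (geometric series).
    Returns infinity when the tail does not decay."""
    alpha, C = tail_decay(coeffs, m)
    if alpha <= 0.0:
        return math.inf
    p = len(coeffs) - 1
    return C * math.exp(-alpha * (p + 1)) / (1.0 - math.exp(-alpha))

def needs_refinement(coeffs, tol, m=4):
    """Flag an element whose spectral tail is too large or not decaying."""
    return tail_estimate(coeffs, m) > tol

smooth = [math.exp(-1.0 * n) for n in range(13)]     # exponential decay
rough = [1.0 / (n + 1) ** 2 for n in range(13)]      # slow algebraic decay
for name, a in (("smooth", smooth), ("rough", rough)):
    alpha, C = tail_decay(a)
    print(name, round(alpha, 3), tail_estimate(a), needs_refinement(a, 1e-4))
```

With these made-up spectra the exponentially decaying coefficients pass the tolerance while the algebraically decaying ones, typical of a solution with a singular point, are flagged.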

The effectiveness of this approach is first demonstrated in Sect. 4.5 for scalar problems where the convergence and behavior of the refinement criteria can be checked carefully. More complicated examples are provided in Sect. 5 with several incompressible flow problems. An attempt is made throughout to illustrate both the benefits and difficulties of using this kind of high-order adaptive method, and to point out applications where it may have some advantage over other numerical methods.


4.1  Framework 

In  this  section  we  restrict  our  attention  to  two-dimensional  problems.  Most  of 
the  difficulties  arise  in  two  dimensions  and  there  are  no  fundamental  barriers 
(other  than  computing  power)  in  extending  the  method  to  three  dimensions. 
To begin, let D be some region of space that has been partitioned into K
subdomains which we denote $D^{(k)}$. We consider two related problems:

1. Given a discretization tolerance $\epsilon$, generate a spatial discretization
$D = \{D^{(k)}\}$ that allows the tolerance to be met;

2. Given a spatial discretization $D = \{D^{(k)}\}$, generate a finite-dimensional
approximation $u_h \approx u$. The function u may be given explicitly or implicitly,
i.e. as the solution of a boundary-value problem.

Our approach to problem (1) is to create a hierarchy of grids by forming a
quadtree partition of D. This provides the computational domain for problem
(2) where we apply a nonconforming spectral element method to approximate u.


4.2  Mesh  generation 

The  mesh  generation  problem  is  somewhat  simpler,  so  we  describe  that  first. 
A  quadtree  is  a  partition  of  two-dimensional  space  into  squares.  Each  square 
is a node of the tree. It has up to four daughters, obtained by bisecting the
square  along  each  dimension.  Each  node  in  a  quadtree  has  geometrical  proper¬ 
ties  (spatial  coordinates,  size)  and  topological  properties  (parents,  daughters, 
siblings).  Geometrical  properties  of  daughter  nodes  are  inherited  from  par¬ 
ents,  and  thus  the  geometrical  properties  of  the  entire  tree  are  determined 
by  the  root  node. 

To represent the topological aspects of the tree we use an idea originally
developed for gravitational N-body problems [70]. Every possible square
is assigned a unique integer key. The root of the tree is assigned key 1. The
daughters of any node are obtained by a left-shift of two bits of the parent’s
key, followed by a bitwise OR with a value in the range 00-11 (binary) to
distinguish each sibling. A node’s parent is obtained by a two-bit right-shift
of its own key. Since the set of keys installed in the tree at any time is much
smaller than the set of all possible keys, a hash table is used for storage and
lookup. From the complete set of nodes in the tree we choose a certain subset
$D^{(k)} \subset S^{(k)}$ to form the active elements of the computational domain. Figure 4.16
shows  a  four-level  quadtree  with  thirteen  nodes  and  K  =  10  active  elements. 
Active  elements  in  the  figure  are  shown  with  a  solid  outline  while  inactive 
elements  are  shown  with  a  dashed  outline.  Inactive  elements  are  retained 
so  that  they  are  available  for  coarsening  the  mesh,  if  necessary.  The  only 
requirement  enforced  on  the  topology  of  the  mesh  is  that  active  elements  that 
share  a  boundary  segment  live  at  most  one  refinement  level  apart,  limiting 
adjacent  elements  to  a  two-to-one  refinement  ratio.  This  imposes  a  certain 
smoothness  on  the  change  in  resolution  in  the  mesh  that  is  appropriate  for 
the  class  of  smooth  functions  we  wish  to  represent. 
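The key arithmetic described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the names `daughter_key`, `parent_key`, `level` and the `Quadtree` class are hypothetical, a Python dictionary stands in for the hash table, and the two-to-one refinement constraint is not enforced here.

```python
ROOT_KEY = 1  # key of the root square, as in the text


def daughter_key(parent, quadrant):
    """Key of one daughter: two-bit left shift, then OR with 0b00..0b11."""
    if not 0 <= quadrant <= 3:
        raise ValueError("quadrant must be in 0..3")
    return (parent << 2) | quadrant


def parent_key(key):
    """Key of the parent: two-bit right shift."""
    if key <= ROOT_KEY:
        raise ValueError("the root has no parent")
    return key >> 2


def level(key):
    """Refinement level (root = 0), recovered from the key's bit length."""
    return (key.bit_length() - 1) // 2


class Quadtree:
    """Hash-table storage of installed nodes (assumed layout: key -> active)."""

    def __init__(self):
        self.nodes = {ROOT_KEY: True}

    def refine(self, key):
        """Deactivate an active node and install its four daughters."""
        if not self.nodes.get(key, False):
            raise KeyError("can only refine an active element")
        self.nodes[key] = False  # kept in the table so the mesh can coarsen
        for quadrant in range(4):
            self.nodes[daughter_key(key, quadrant)] = True

    def active(self):
        """Sorted keys of the active elements."""
        return sorted(k for k, is_active in self.nodes.items() if is_active)


tree = Quadtree()
tree.refine(0b1)      # root -> 4 daughters
tree.refine(0b100)    # one level-1 element -> 4 daughters
tree.refine(0b10000)  # one level-2 element -> 4 daughters
print(len(tree.nodes), len(tree.active()))
```

Three successive refinements of this kind give a four-level tree with thirteen installed nodes, ten of them active, which are the counts quoted for Figure 4.16.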

4.3  Refinement  criteria 

The  adaptive  mesh  generation  described  above  and  high-order  domain  de¬ 
composition  methods  described  in  §3  are  coupled  through  the  refinement 
criteria  used  to  drive  adaptivity.  Here  we  consider  three  types  of  refinement 
criteria. 

The  first  is  by  far  the  simplest:  refine  everywhere  that  solution  gradients 
are  large.  We  can  enforce  this  idea  by  requiring 

$$\| \nabla u^{(k)} \| < \epsilon \, \| u_h \|_1 \qquad (4.46)$$

everywhere in the mesh, where $\|\cdot\|$ is the $L^2$ norm, $\|\cdot\|_1$ is the $H^1$ norm,
and $\epsilon$ is the discretization tolerance. This is a common refinement criterion in
cases where there is simply no alternative measure of solution errors.

Fig. 4.16. A four-level quadtree mesh, expanded to show the elements that make up
each level. Each leaf node $S^{(k)}$ has a unique integer key shown in binary. Daughter
keys are generated from a parent’s key by a two-bit left shift, followed by a bitwise OR
with a value in the range 00-11. The active elements $D^{(k)}$ that make up the current
discretization are shown with a solid outline.


The second type takes direct advantage of the high-order polynomial basis.
Consider the expansion of a given smooth function u over the domain
$D = [-1, 1]^2$ in terms of Legendre polynomials:

$$u(x,y) = \sum_{n=0}^{\infty} \sum_{m=0}^{\infty} a_{n,m} \, P_n(x) \, P_m(y). \qquad (4.47)$$

The expansion coefficients are given by

$$a_{n,m} = \frac{1}{c_n c_m} \int_{-1}^{1} \int_{-1}^{1} u \, P_n \, P_m \, |J| \, dx \, dy, \qquad (4.48)$$


where the normalization constant is $c_i = (i + 1/2)^{-1}$. We have included
the  Jacobian  |  J\  to  include  the  effects  of  element  size  and  other  geometric 
transformations,  e.g.  curvilinear  boundaries.  There  is  nothing  magical  about 
Legendre  polynomials — they  are  simply  a  convenient  orthogonal  basis  for 
projecting the approximation onto. Since our approximate solution $u_h \approx u$
is formed essentially by truncating this expansion at some finite order p, we
can form an estimate of the approximation error $\| u - u_h \|$ by examining the
tail of the spectrum.

To do so we first average over polynomials in x and y to produce an
equivalent one-dimensional spectrum:

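$$\bar{a}_p = |a_{p,p}| + \sum_{i=0}^{p-1} \left( |a_{i,p}| + |a_{p,i}| \right). \qquad (4.49)$$

Next we replace the discrete spectrum $\bar{a}_p$ with an approximation to a decaying
exponential:

$$\tilde{a}(n) = \mathrm{const} \times \exp(-\sigma n). \qquad (4.50)$$

The function $\tilde{a}(n)$ is a least squares best fit to the last four points in the
spectrum $\bar{a}_p$. Our refinement criterion becomes

$$\left( \tilde{a}(p)^2 + \int_{p+1}^{\infty} \tilde{a}(n)^2 \, dn \right)^{1/2} < \epsilon \, \| u_h \|. \qquad (4.51)$$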

The only practical complication here is making sure the decay rate $\sigma > 0$
so that the integral converges. Otherwise, the estimate is ignored and the
element is flagged for immediate refinement. This method is analyzed in [60]
where it is shown to be an effective refinement criterion for driving h-p
refinement.
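The exponential-fit estimate can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the fit is done as a linear least-squares fit in log space (one common way to fit a decaying exponential), and the spectrum entries are assumed to be strictly positive. For the fitted form, the tail integral has the closed form $\mathrm{const}^2 \exp(-2\sigma(p+1)) / (2\sigma)$.

```python
import math


def fit_exponential(spectrum, npoints=4):
    """Fit log(a_n) = log(const) - sigma * n to the last `npoints` entries."""
    p = len(spectrum) - 1
    ns = list(range(p - npoints + 1, p + 1))
    ys = [math.log(spectrum[n]) for n in ns]
    n_mean = sum(ns) / npoints
    y_mean = sum(ys) / npoints
    slope = (sum((n - n_mean) * (y - y_mean) for n, y in zip(ns, ys))
             / sum((n - n_mean) ** 2 for n in ns))
    sigma = -slope
    const = math.exp(y_mean + sigma * n_mean)
    return const, sigma


def error_estimate(spectrum):
    """Left-hand side of (4.51), or None when the spectrum does not decay."""
    p = len(spectrum) - 1
    const, sigma = fit_exponential(spectrum)
    if sigma <= 0.0:
        return None  # integral diverges: flag the element for refinement
    a_p = const * math.exp(-sigma * p)
    tail = const ** 2 * math.exp(-2.0 * sigma * (p + 1)) / (2.0 * sigma)
    return math.sqrt(a_p ** 2 + tail)


# Synthetic spectrum with const = 2 and sigma = 0.5, orders 0..8.
spectrum = [2.0 * math.exp(-0.5 * n) for n in range(9)]
print(fit_exponential(spectrum), error_estimate(spectrum))
```

An element would then be refined when the returned value is `None` or exceeds $\epsilon \| u_h \|$.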

The third refinement criterion is similar. Since the main contribution to (4.51)
comes from the coefficients of order p, we can simply sum along the tail of
the spectrum. For an accurate representation of u we require the spectrum
to satisfy the discretization tolerance:

$$|a_{p,p}| + \sum_{i=0}^{p-1} \left( |a_{i,p}| + |a_{p,i}| \right) \le \epsilon \, \| u \|. \qquad (4.52)$$

This  method  is  somewhat  simpler  to  apply  and,  as  we  will  see,  produces 
almost  identical  results. 

To use these polynomial spectrum criteria with our spectral element
method (based on GLL polynomials) we first perform a Legendre transform
of the local solution $u^{(k)} \to a_{n,m}$ and then use (4.51) or (4.52) to decide
if the element should be refined. Although we keep p fixed, the error is
reduced because we approximate u over a smaller region. This basic idea
is  illustrated  in  Fig.  4.17.  Here  a  smooth  function  f(x,y)  has  been  projected 
onto  the  Legendre  polynomials  of  order  p  <  64.  For  a  given  order  p  the  true 
approximation  error  would  be  given  by  the  sum  over  all  coefficients  not  con¬ 
tained  in  the  box  m,n  <  p.  We  estimate  the  magnitude  of  that  error  by 
simply  summing  coefficients  along  the  solid  lines. 
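The Legendre transform and the sum along the edge of the coefficient box can be sketched in pure Python. The sketch makes several assumptions that are not in the text: the element is the reference square with $|J| = 1$, the coefficients are computed by Gauss-Legendre quadrature rather than from GLL nodal values, and all function names are hypothetical.

```python
import math


def legendre(n, x):
    """Return [P_0(x), ..., P_n(x)] from the three-term recurrence."""
    vals = [1.0, x]
    for k in range(1, n):
        vals.append(((2 * k + 1) * x * vals[k] - k * vals[k - 1]) / (k + 1))
    return vals[:n + 1]


def gauss_legendre(npts):
    """Gauss-Legendre nodes and weights on [-1, 1] by Newton iteration."""
    nodes, weights = [], []
    for i in range(npts):
        x = math.cos(math.pi * (i + 0.75) / (npts + 0.5))  # initial guess
        for _ in range(100):
            P = legendre(npts, x)
            dP = npts * (x * P[npts] - P[npts - 1]) / (x * x - 1.0)
            dx = P[npts] / dP
            x -= dx
            if abs(dx) < 1e-15:
                break
        P = legendre(npts, x)
        dP = npts * (x * P[npts] - P[npts - 1]) / (x * x - 1.0)
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dP * dP))
    return nodes, weights


def legendre_coefficients(f, p, npts=64):
    """a[n][m] of (4.48) for 0 <= n, m <= p, with n along x and m along y."""
    x, w = gauss_legendre(npts)
    P = [legendre(p, xi) for xi in x]  # P[i][n] = P_n(x_i)
    F = [[f(xi, yj) for yj in x] for xi in x]
    # Integrate over x first, then over y (the integrand is a tensor product).
    g = [[sum(w[i] * F[i][j] * P[i][n] for i in range(npts))
          for j in range(npts)] for n in range(p + 1)]
    return [[(n + 0.5) * (m + 0.5)
             * sum(w[j] * g[n][j] * P[j][m] for j in range(npts))
             for m in range(p + 1)] for n in range(p + 1)]


def trace_estimate(a, p):
    """Left-hand side of (4.52): sum of |a| along the edge of the box."""
    return abs(a[p][p]) + sum(abs(a[i][p]) + abs(a[p][i]) for i in range(p))


def runge(x, y):
    return 1.0 / (1.0 + 25.0 * (x * x + y * y))


a = legendre_coefficients(runge, 32)
for order in (16, 32):
    print(order, trace_estimate(a, order))
```

The printed values shrink rapidly with the order, which is the behavior the estimate relies on; the exact numbers depend on the normalization and need not match those quoted in the figure caption.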

Fig. 4.17. Error estimate based on the local polynomial spectrum: (a) contours of
the function $f = 1/(1 + 25r^2)$; (b) scatter plot showing the polynomial coefficients
$|a_{n,m}|$. An estimate of the basis accuracy for a given polynomial order is formed by
summing $|a_{n,m}|$ along the solid lines: $\epsilon = 0.0962$ (p = 16); $\epsilon = 0.000314$ (p = 32);
$\epsilon = 8.25 \times 10^{-7}$ (p = 48).


4.4  Implementation  notes

We end this section with a few additional notes on implementation. The
algorithms described above have been implemented using a combination of C for
the computational modules and C++ for high-level data types like Element
= $D^{(k)}$ and Field = $u_h$ that make up the discretization. The logic and control
structure needed for most of the code are the same as in any algorithm
for finite element methods. The most complex problem is maintaining the
connectivity of the mesh dynamically, and the approach taken here is worth
mentioning. The geometry and topology of the mesh are closely connected.


Fig.  4.18.  The  logical  structure  of  a  spectral  element  mesh  can  be  divided  into 
three  geometric  parts:  (o)  vertices,  ( — )  edges,  and  (shaded)  interiors.  Edges  and 
vertices  define  the  connectivity  in  the  mesh. 


Figure  4.18  shows  the  three  geometric  elements  of  the  discretization:  vertices, 
edges,  and  interiors.  Obviously  interior  points  are  completely  local  to  an  el¬ 
ement  and  play  no  role  in  the  global  system.  All  connectivity  in  the  mesh  is 
through  the  edges  and  vertices.  Because  of  the  method  used  to  construct  the 
grid  these  geometric  elements  are  interlocking.  The  midpoint  of  each  noncon¬ 
forming  edge  aligns  with  the  shared  vertex  of  its  two  adjacent  elements.  As 
discussed  below,  this  feature  is  used  to  simplify  the  procedure  for  setting  up 
the  mesh  topology. 

Figure  4.18  shows  one  other  side  effect  of  the  mesh  generation.  Internal 
curvilinear  boundaries  are  automatically  propagated  down  the  various  levels 
of  the  refinement  tree  because  of  the  isoparametric  representation  of  the 
geometry.  In  the  same  way  that  a  solution  field  is  projected  onto  a  new 
set  of  elements,  the  polynomial  representation  of  the  geometry  can  also  be 
projected  to  a  finer  grid.  On  the  other  hand,  external  boundaries  like  the 
B-spline  segment  shown  as  the  lower  boundary  in  the  figure  are  explicitly 
re-evaluated  to  keep  the  representation  as  accurate  as  possible. 

How does one represent the topology of this kind of mesh? One solution
is to use pointers. This immediately runs into the problem of interpreting
pointers to objects on remote processors if the computation is running in
parallel. Instead we use the concept of a voxel database (VDB) of geometric
positions in the mesh [85]. A VDB may be thought of as a register of
position-subscript pairs. To each position stored in the VDB we assign a unique integer
subscript so that data may be associated with points in space by using the
subscript as an index into an array.

The  basic  idea  is  illustrated  in  Fig.  4.19.  The  number  of  times  a  position 
is  registered  is  its  multiplicity.  Data  objects  that  share  positions  also  share 
memory  by  virtue  of  a  common  subscript.  In  essence  the  VDB  provides  a 
natural  map  of  the  mesh  geometry  onto  the  computer’s  memory.  This  basic 
paradigm  can  be  used  to  implement  many  types  of  finite  element  or  finite 
volume  methods  [85]. 
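A single-process sketch of the register described above is given below. The class and method names are hypothetical and the design is an assumption: positions are snapped to an integer lattice of spacing `tol` so that coordinates differing only by round-off map to the same dictionary key. The parallel bookkeeping of [85] is not represented.

```python
class VoxelDatabase:
    """Register of position -> subscript pairs with a multiplicity count."""

    def __init__(self, tol=1e-9):
        self.tol = tol          # positions closer than tol are identified
        self.subscript = {}     # lattice key (tuple of ints) -> subscript
        self.multiplicity = []  # multiplicity[subscript] = times registered

    def _voxel(self, position):
        # Snap each coordinate to the nearest lattice point.
        return tuple(round(c / self.tol) for c in position)

    def register(self, position):
        """Return the subscript of a position, creating it on first use."""
        key = self._voxel(position)
        idx = self.subscript.get(key)
        if idx is None:
            idx = len(self.multiplicity)
            self.subscript[key] = idx
            self.multiplicity.append(0)
        self.multiplicity[idx] += 1
        return idx


# Two unit squares sharing the edge x = 1: register the vertices of each.
vdb = VoxelDatabase()
elem_a = [vdb.register(v) for v in [(0, 0), (1, 0), (1, 1), (0, 1)]]
elem_b = [vdb.register(v) for v in [(1, 0), (2, 0), (2, 1), (1, 1)]]
print(elem_a, elem_b, vdb.multiplicity)
```

The two shared vertices receive the same subscript in both elements and end up with multiplicity two, so nodal data indexed by subscript is shared automatically.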


Fig. 4.19. Connectivity and communications are established by building a voxel
database  (VDB)  of  positions.  A  VDB  maps  each  position  to  a  unique  index  or 
subscript.  It  also  tracks  points  shared  by  multiple  processors  to  provide  a  loosely 
synchronous  shared  memory.  Points  that  share  memory  are  those  at  the  same  geo¬ 
metric  position. 


To  establish  the  connectivity  of  a  mesh  like  the  one  depicted  in  Fig.  4.18 
we  build  two  separate  VDBs:  one  for  the  vertices  and  one  for  the  midpoints 
of  the  edges.  Every  vertex  with  multiplicity  one  that  does  not  lie  along  an 
external  boundary  is  virtual  and  not  part  of  the  true  mesh  degrees  of  freedom. 
Every  edge  with  multiplicity  one  that  does  not  lie  along  an  external  boundary 
is  nonconforming.  For  each  nonconforming  edge  we  make  a  second  query  to 
the  VDB  using  the  endpoints.  If  there  is  a  match  then  the  edge  is  also  virtual 
and  we  store  the  subscript  of  the  adjacent  edge.  Otherwise  it  is  simply  flagged 
as  an  internal  nonconforming  boundary  segment. 

The  shared  memory  represented  by  a  VDB  is  extended  across  processor 
boundaries  by  passing  around  a  list  of  local  positions  and  comparing  against 
those  registered  remotely.  A  communications  link  is  established  for  each  com¬ 
mon  position.  The  shared  memory  at  each  point  is  weakly  coherent  and  must 
be synchronized by explicit message passing. For example, elements on separate
processors with a common boundary segment share data along an edge.
Each  processor  may  update  its  edge  values  independently  and  then  call  a 
synchronization  routine  that  combines  local  and  remote  values  to  produce  a 
globally  consistent  data  set.  For  further  details  see  [85]. 

There  is  very  little  overhead  for  the  adaptive  versus  non-adaptive  data 
structure:  just  one  integer  (the  node  key)  per  element.  Likewise,  an  iterative 
solver  for  sparse  systems  incurs  no  performance  penalty  just  because  the  un¬ 
derlying  mesh  is  adaptive.  When  approached  in  the  right  way  the  conversion 
to  a  solution  adaptive  code  is  almost  trivial.  To  a  large  degree  this  is  because 
of  the  unstructured  nature  of  the  spectral  element  method  we  built  upon. 

4.5  Examples 

Next we illustrate the performance of the method with a few simple test
problems. First we consider the solution of the Poisson equation $\nabla^2 u = f$
with the right-hand side given by

$$f(x, y) = (400^2 r^2 - 800) \, e^{-400 r^2 / 2}, \qquad (4.53)$$

where $r^2 = x^2 + y^2$. The exact solution is given by

$$u(x, y) = e^{-400 r^2 / 2}. \qquad (4.54)$$

We take the computational domain $D = [-0.5, 0.5]^2$ and impose homogeneous
boundary conditions u = 0 along the perimeter $\partial D$. This same test case is
studied  in  [35]  to  check  the  performance  of  a  fast  multipole  method  using  a 
similar  type  of  spatial  discretization. 
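As a quick sanity check that (4.53) and (4.54) are consistent, the Laplacian of the exact solution can be compared with the right-hand side numerically. The sketch below uses a five-point central difference with a hypothetical step size `h`; it is a verification aid, not part of the spectral element method.

```python
import math


def u_exact(x, y):
    """Exact solution (4.54)."""
    return math.exp(-400.0 * (x * x + y * y) / 2.0)


def f_rhs(x, y):
    """Right-hand side (4.53)."""
    r2 = x * x + y * y
    return (400.0 ** 2 * r2 - 800.0) * math.exp(-400.0 * r2 / 2.0)


def laplacian_fd(u, x, y, h=1e-4):
    """Five-point central-difference approximation of the Laplacian."""
    return (u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h)
            - 4.0 * u(x, y)) / (h * h)


for x, y in [(0.0, 0.0), (0.05, 0.03), (0.1, -0.08)]:
    print(x, y, laplacian_fd(u_exact, x, y), f_rhs(x, y))

# The Gaussian is negligibly small on the boundary of [-0.5, 0.5]^2,
# which is why homogeneous boundary conditions are compatible with (4.54).
print(u_exact(0.5, 0.0))
```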


Table  4.4.  Solution  times  and  relative  errors  for  solving  the  Poisson  equation  on  a 
uniform  grid  with  order  p  =  7  elements.  Columns  (I)  and  (II)  show  the  estimated 
error  using  the  exponential  fit  and  summing  the  trace  of  the  Legendre  polynomial 
spectrum,  respectively. 


No. levels   No. points   (I)             (II)            $\|u - u_h\|$

0            64                                           0.534
1            256          0.0127          0.0117          0.0113
2            1024         0.000762        0.000735        0.000389
3            4096         2.625 x 10^-6   2.575 x 10^-6   1.318 x 10^-6
4            16384        1.189 x 10^-7   1.187 x 10^-7   8.212 x 10^-8

Since this problem has a well-defined exact solution, we begin by comparing
the error estimates and the true error $\| u - u_h \|$ on a uniformly refined
grid (Table 4.4). This table shows that the error estimates are actually quite
sharp, differing from the $L^2$ error by only a small multiplicative factor. Also
note that the two spectrum-based estimates are nearly equivalent. This is true in
general. Because the trace is easier and faster to compute, this is the method
that will be used from this point forward unless noted otherwise.

Adaptive  mesh  generation  based  on  the  different  refinement  criteria  is 
illustrated  in  Fig.  4.20.  In  this  case  both  methods  produce  roughly  equivalent 
discretizations.  The  grid  is  refined  in  approximately  the  same  location  and  to 
the  same  depth  for  a  given  discretization  tolerance.  Both  methods  generate  a 
six-level quadtree with roughly the same number of active elements ($K \approx 300$)
using  a  uniform  basis  of  order  p  =  7. 


Fig. 4.20. Adaptive mesh generation: (center) contours of the function f(x,y)
inside the computational domain; (left) adaption based on function gradients with a
tolerance of $\epsilon = 0.0863$; (right) adaption based on the local Legendre polynomial
spectrum with a tolerance of $\epsilon = 9.01 \times 10^{-7}$.


For the second example we consider the solution of the Poisson equation

$$\nabla^2 u + 1 = 0, \qquad (4.55)$$

on the same domain D with homogeneous boundary conditions u = 0 on $\partial D$.
The  structure  of  the  solution  is  quite  different,  as  shown  in  Fig.  4.21.  There 
is  a  weak  singularity  in  the  corners  of  the  domain  where  the  solution  must 
simultaneously  match  the  curvature  and  the  boundary  conditions.  However, 
the  solution  gradients  are  largest  along  the  edges  of  the  domain  where  the 
structure  of  u  is  rather  simple.  In  this  case  our  two  refinement  criteria  lead  to 
nearly  complementary  grids.  Clearly  the  local  polynomial  spectrum  indicates 
the  correct  location  for  refinement  while  the  magnitude  of  solution  gradients 
can be misleading. In this case mesh refinement based on solution gradients
completely misses the locations (i.e. the corners) where the errors are largest.


Fig. 4.21. Adaptive solution of the Poisson equation with corner singularities:
(center) contours of the solution computed on a uniform fine grid; (left) adaption based
on the solution gradients with a tolerance of $\epsilon = 0.3$; (right) adaption based on the
local Legendre polynomial spectrum with a tolerance of $\epsilon = 10^{-8}$.


4.6  Summary

We have outlined the basic features and implementation of an adaptive
spectral element method. Perhaps the most interesting part of the method is the
‘built-in’ refinement criterion provided by the local polynomial spectrum. This
provides a heuristic error estimate that is independent of the system being
solved. The local spectral properties were shown to be a useful and relatively
robust criterion for simple linear problems (the Poisson equation). In the
following section we will look at more complex nonlinear problems.

One  area  of  potentially  great  improvement  is  in  the  algorithms  used  to 
implement  the  sparse  matrix  solver.  For  example,  recent  work  on  fast  multi¬ 
pole  methods  for  spectral  elements  shows  great  promise  for  solving  Poisson 
and  Helmholtz  equations  [35].  There  are  a  host  of  other  possibilities  that 
take  better  advantage  of  the  multilayer  structure  of  the  grid  than  the  more 
straightforward  CG  iterations  considered  here.  Also  note  that  direct  solvers 
are  still  feasible  in  adaptive  calculations  as  long  as  a  relatively  large  number 
of  elliptic  solves  take  place  between  adaption  steps. 


5  Fluid  Dynamics 


Advances  in  both  computer  technology  and  numerical  methods  have  opened 
new  possibilities  for  the  study  of  fluid  dynamics  through  large-scale  simula¬ 
tions.  Building  on  the  spatial  discretizations  presented  so  far,  we  now  turn 
to the solution of the Navier-Stokes equations for unsteady two- and
three-dimensional problems and show that high-order splitting methods reduce the
computational burden to solving a series of Helmholtz problems.

5.1  Incompressible  flows

We consider here Newtonian fluids with constant density $\rho$ and kinematic
viscosity $\nu$, the motion of which is governed by the incompressible
Navier-Stokes equations:

$$\nabla \cdot \mathbf{u} = 0 \quad \text{in } \Omega, \qquad (5.1a)$$

$$\partial_t \mathbf{u} = \mathbf{N}(\mathbf{u}) - \frac{1}{\rho} \nabla p + \frac{1}{Re} \nabla^2 \mathbf{u} \quad \text{in } \Omega, \qquad (5.1b)$$

where $\mathbf{u} = (u_1, u_2, u_3)$ is the velocity field, p is the static pressure, $Re = UL/\nu$
is the Reynolds number, and $\Omega$ is the computational domain. These equations
are written in non-dimensional form where velocities are scaled by U and
lengths by L. Without loss of generality we take the numerical value of $\rho = 1$
since this simply sets the scale for p. $\mathbf{N}(\mathbf{u})$ represents the nonlinear advection
term:

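$$\mathbf{N}(\mathbf{u}) = -(\mathbf{u} \cdot \nabla) \mathbf{u}, \qquad (5.2a)$$

$$\phantom{\mathbf{N}(\mathbf{u})} = -\tfrac{1}{2} \left[ (\mathbf{u} \cdot \nabla) \mathbf{u} + \nabla \cdot (\mathbf{u}\mathbf{u}) \right], \qquad (5.2b)$$

$$\phantom{\mathbf{N}(\mathbf{u})} = -\tfrac{1}{2} \nabla (\mathbf{u} \cdot \mathbf{u}) + \mathbf{u} \times (\nabla \times \mathbf{u}). \qquad (5.2c)$$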

We  refer  to  these  as  the  convective  form,  skew-symmetric  form,  and  rotational 
form,  respectively.  These  three  forms  for  N(u)  are  mathematically  equivalent 
but  behave  differently  when  implemented  for  a  discrete  system.  As  shown  by 
Zang  [91],  the  skew-symmetric  form  is  the  most  robust;  this  form  is  used  in 
all  calculations  unless  noted  otherwise. 

The Navier-Stokes equations are coupled through the incompressibility
constraint $\nabla \cdot \mathbf{u} = 0$ and the nonlinear term $\mathbf{N}(\mathbf{u})$. However, the biggest
challenge for time-integration comes from the linear term:

$$\mathbf{L}(\mathbf{u}) = \frac{1}{Re} \nabla^2 \mathbf{u}. \qquad (5.3)$$

This  term  is  responsible  for  the  fastest  time  scales  in  the  system  and  thus 
poses  the  most  severe  constraint  on  the  maximum  allowable  time  step  for 
numerical  integration  of  the  fluid  equations.  Problems  associated  with  the 
stiffness  of  the  linear  operator  are  handled  by  treating  this  term  implicitly, 
while  the  nonlinear  term  can  be  integrated  with  an  easier  explicit  method. 
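The benefit of treating the stiff linear term implicitly can be seen on a scalar model problem. The sketch below is an illustration only: the model equation $u' = \lambda u + N(u)$ with $\lambda = -1000$ and $N(u) = \sin u$ is an assumption chosen to mimic a fast linear time scale next to a mild nonlinearity, and the function names are hypothetical.

```python
import math


def imex_euler(u, dt, lam, nonlinear, steps):
    """First-order implicit/explicit Euler for u' = lam * u + nonlinear(u).

    The linear term is taken at the new time level, the nonlinear term at
    the old one, so each step only requires a division by (1 - dt * lam).
    """
    for _ in range(steps):
        u = (u + dt * nonlinear(u)) / (1.0 - dt * lam)
    return u


def explicit_euler(u, dt, lam, nonlinear, steps):
    """Fully explicit Euler for the same equation, for comparison."""
    for _ in range(steps):
        u = u + dt * (lam * u + nonlinear(u))
    return u


lam, dt, steps = -1000.0, 0.01, 20
print("IMEX    :", imex_euler(1.0, dt, lam, math.sin, steps))
print("explicit:", explicit_euler(1.0, dt, lam, math.sin, steps))
```

With this time step the explicit amplification factor is close to $1 + \lambda \Delta t = -9$, so the explicit iterate grows without bound, while the implicit/explicit iterate decays as the true solution does.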


Semi-discrete formulation To solve the Navier-Stokes equations, (5.1b)
is integrated over a single time step to obtain:

$$\mathbf{u}(t + \Delta t) = \mathbf{u}(t) + \int_{t}^{t+\Delta t} \left[ \mathbf{N}(\mathbf{u}) - \frac{1}{\rho} \nabla p + \mathbf{L}(\mathbf{u}) \right] dt. \qquad (5.4)$$

Next we introduce a discrete set of times $t_n = n \Delta t$ where the solution is to
be evaluated, and define $\mathbf{u}^n = \mathbf{u}(\mathbf{x}, t_n)$ as the semi-discrete approximation
to the velocity (discrete in time, continuous in space). For reasons that will
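be explained in a moment, the pressure integral is replaced with:

$$\Delta t \, \nabla \bar{p} = \frac{1}{\rho} \int_{t_n}^{t_{n+1}} \nabla p \, dt, \qquad (5.5)$$

where $\bar{p}$ is a time-averaged pressure over the step, written simply as p in what follows.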


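Next we introduce appropriate integration schemes for the linear and
nonlinear terms. The simplest implicit/explicit scheme would be first-order Euler
time integration:

$$\int_{t_n}^{t_{n+1}} \mathbf{L}(\mathbf{u}) \, dt \approx \Delta t \, \mathbf{L}(\mathbf{u}^{n+1}), \qquad (5.6)$$

$$\int_{t_n}^{t_{n+1}} \mathbf{N}(\mathbf{u}) \, dt \approx \Delta t \, \mathbf{N}(\mathbf{u}^{n}). \qquad (5.7)$$

Combining (5.5)-(5.7) we get a semi-discrete approximation to the momentum
equation:

$$\mathbf{u}^{n+1} = \mathbf{u}^{n} + \left[ \mathbf{N}(\mathbf{u}^{n}) - \nabla p + \mathbf{L}(\mathbf{u}^{n+1}) \right] \Delta t. \qquad (5.8)$$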

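This system of equations can be solved by further splitting (5.8) into three
substeps as follows:

$$\mathbf{u}^{(1)} - \mathbf{u}^{n} = \Delta t \, \mathbf{N}(\mathbf{u}^{n}), \qquad (5.9a)$$

$$\mathbf{u}^{(2)} - \mathbf{u}^{(1)} = -\Delta t \, \nabla p, \qquad (5.9b)$$

$$\mathbf{u}^{n+1} - \mathbf{u}^{(2)} = \Delta t \, \mathbf{L}(\mathbf{u}^{n+1}). \qquad (5.9c)$$

Here $\mathbf{u}^{(1)}$ and $\mathbf{u}^{(2)}$ are intermediate velocity fields that progressively
incorporate the nonlinear terms and the incompressibility constraint. The motivation
for the splitting is to decouple the pressure term from the advection and
diffusion terms.

The classical splitting scheme proceeds by introducing two assumptions:
that $\mathbf{u}^{(2)}$ satisfies the divergence-free condition ($\nabla \cdot \mathbf{u}^{(2)} = 0$), and that $\mathbf{u}^{(2)}$
satisfies the correct Dirichlet boundary conditions in the direction normal to
the boundary ($\mathbf{n} \cdot \mathbf{u}^{(2)} = \mathbf{n} \cdot \mathbf{u}^{n+1}$). Incorporating these assumptions, we can
derive a separately solvable elliptic problem for the pressure in the form:

$$\nabla^2 p = \frac{1}{\Delta t} \left( \nabla \cdot \mathbf{u}^{(1)} \right). \qquad (5.10)$$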
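The step from (5.9b) to (5.10) can be spelled out as follows, using only the divergence-free assumption on the second intermediate field:

```latex
\begin{align*}
\nabla \cdot \mathbf{u}^{(2)} - \nabla \cdot \mathbf{u}^{(1)}
  &= -\Delta t \, \nabla^2 p
  && \text{(divergence of (5.9b))} \\
-\nabla \cdot \mathbf{u}^{(1)}
  &= -\Delta t \, \nabla^2 p
  && \text{(using } \nabla \cdot \mathbf{u}^{(2)} = 0 \text{)} \\
\nabla^2 p
  &= \frac{1}{\Delta t} \, \nabla \cdot \mathbf{u}^{(1)}.
\end{align*}
```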

The field p becomes a dynamic variable that couples the divergence-free
condition and the momentum equation. The correct Neumann boundary conditions
for p come from (5.8), which can be simplified to the form:

$$\frac{\partial p}{\partial n} = \mathbf{n} \cdot \left[ \mathbf{N}(\mathbf{u}^{n}) - \frac{1}{Re} \nabla \times \nabla \times \mathbf{u}^{n} \right]. \qquad (5.11)$$

This boundary condition prevents the propagation and accumulation of time
differencing errors and ensures that p satisfies the important pressure
compatibility condition [48]. Note that the linear term in (5.11) is derived from
$\mathbf{L}(\mathbf{u}^{n})$ rather than $\mathbf{L}(\mathbf{u}^{n+1})$. This type of first-order extrapolation is
necessary to keep the pressure equation decoupled from the other substeps. The
order of the extrapolation should be consistent with the overall time accuracy.


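Higher-order schemes It is relatively easy to make the integration scheme
outlined above more accurate in time, i.e. to increase the time accuracy to
$O(\Delta t^J)$. The basic idea is to use higher-order multi-step schemes for the time
integration. Time derivatives can be approximated with a backward difference
of the form:

$$\partial_t \mathbf{u} \approx \frac{1}{\Delta t} \left( \gamma_0 \mathbf{u}^{n+1} - \sum_{q=0}^{J-1} \alpha_q \mathbf{u}^{n-q} \right), \qquad (5.12)$$

where $\gamma_0 = \sum_q \alpha_q$ for consistency. The nonlinear term can be integrated using
an Adams-Bashforth method:

$$\int_{t_n}^{t_{n+1}} \mathbf{N}(\mathbf{u}) \, dt \approx \Delta t \sum_{q=0}^{J-1} \beta_q \mathbf{N}(\mathbf{u}^{n-q}), \qquad (5.13)$$

where $\sum_q \beta_q = 1$. The pressure boundary conditions should be integrated with
a scheme of the same order to ensure consistent time accuracy:

$$\frac{\partial p}{\partial n} = \mathbf{n} \cdot \sum_{q=0}^{J-1} \beta_q \left[ \mathbf{N}(\mathbf{u}^{n-q}) - \frac{1}{Re} \nabla \times \nabla \times \mathbf{u}^{n-q} \right]. \qquad (5.14)$$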

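Combining these various integration schemes produces the following
semi-discrete equations:

$$\mathbf{u}^{(1)} - \sum_{q=0}^{J-1} \alpha_q \mathbf{u}^{n-q} = \Delta t \sum_{q=0}^{J-1} \beta_q \mathbf{N}(\mathbf{u}^{n-q}), \qquad (5.15a)$$

$$\mathbf{u}^{(2)} - \mathbf{u}^{(1)} = -\Delta t \, \nabla p, \qquad (5.15b)$$

$$\gamma_0 \mathbf{u}^{n+1} - \mathbf{u}^{(2)} = \Delta t \, \mathbf{L}(\mathbf{u}^{n+1}). \qquad (5.15c)$$

This method would typically be used with J = 2 or 3 and an integration
rule like one of the schemes given in Table 5.1. Overall, (5.15) provides a very
efficient way to integrate the Navier-Stokes equations.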
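The two consistency conditions quoted above, $\gamma_0 = \sum_q \alpha_q$ and $\sum_q \beta_q = 1$, are easy to check with exact rational arithmetic. In the sketch below the coefficients are transcribed from Table 5.1, except that the J = 3 stiffly-stable entry uses $\gamma_0 = 11/6$, the value the first condition requires. The dictionary layout and function names are hypothetical.

```python
from fractions import Fraction as F

# order J -> (gamma_0, [alpha_0, alpha_1, ...], [beta_0, beta_1, ...])
ADAMS_BASHFORTH = {
    1: (F(1), [F(1)], [F(1)]),
    2: (F(1), [F(1)], [F(3, 2), F(-1, 2)]),
    3: (F(1), [F(1)], [F(23, 12), F(-4, 3), F(5, 12)]),
}
STIFFLY_STABLE = {
    1: (F(1), [F(1)], [F(1)]),
    2: (F(3, 2), [F(2), F(-1, 2)], [F(2), F(-1)]),
    3: (F(11, 6), [F(3), F(-3, 2), F(1, 3)], [F(5, 2), F(-2), F(1, 2)]),
}


def is_consistent(gamma0, alphas, betas):
    """Check gamma_0 = sum(alpha_q) and sum(beta_q) = 1."""
    return gamma0 == sum(alphas) and sum(betas) == 1


def backward_difference(gamma0, alphas, u, t_new, dt):
    """Right-hand side of (5.12) for a scalar function u(t) at t_new.

    The value u^{n-q} is taken at time t_new - (q + 1) * dt.
    """
    past = sum(a * u(t_new - (q + 1) * dt) for q, a in enumerate(alphas))
    return (gamma0 * u(t_new) - past) / dt


for name, table in [("Adams-Bashforth", ADAMS_BASHFORTH),
                    ("stiffly stable", STIFFLY_STABLE)]:
    for J, (gamma0, alphas, betas) in sorted(table.items()):
        print(name, J, is_consistent(gamma0, alphas, betas))
```

As a further check, the J = 3 backward difference differentiates polynomials up to degree three exactly, which is what third-order accuracy requires.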


Table 5.1. Integration coefficients for multi-step schemes of order J: (top) classic
Adams-Bashforth schemes; (bottom) stiffly-stable schemes with coefficients derived
for a model advection-diffusion equation [48].

J   $\gamma_0$   $\alpha_0$   $\alpha_1$   $\alpha_2$   $\beta_0$   $\beta_1$   $\beta_2$

1   1      1                   1
2   1      1                   3/2     -1/2
3   1      1                   23/12   -4/3    5/12

1   1      1                   1
2   3/2    2     -1/2          2       -1
3   11/6   3     -3/2   1/3    5/2     -2      1/2


Two-dimensional simulations A single time step using the skew-symmetric
form of the nonlinear terms requires ten spatial derivatives plus the solution
to one Poisson equation for the pressure and two Helmholtz equations for
the diffusion in each direction. Most of the computational work is associated
with solving these linear systems; collocation is used to integrate the
nonlinear terms and makes only a minor contribution. The techniques outlined
in Sect. 3 can be applied directly to the solution of the various elliptic
subproblems.

Note that the pressure is indeterminate to within a constant in a
two-dimensional calculation with all Neumann boundary conditions. This is
because for any field p(x,y) that satisfies (5.10), the field p(x,y) + c is also
a  solution,  for  any  constant  c.  This  ambiguity  can  be  removed  in  a  direct 
method  by  setting  exactly  one  pressure  degree-of-freedom  to  zero,  typically 
the  last  element  in  the  array  of  pressure  boundary  unknowns.  For  iterative 
methods  it  is  sufficient  to  set  the  mean  of  the  initial  residual  to  zero. 


Three-dimensional  simulations  We  can  simulate  three-dimensional  flows 
in  one  of  several  ways.  If  the  geometry  is  fully  three-dimensional  we  have  to 
use  hexahedral  or  tetrahedral  spectral  elements  [73].  If  the  problem  has  one 
of  several  symmetries  —  axisymmetric,  spherically  symmetric,  or  periodic  in 
one  direction  —  then  Fourier  expansion  in  one  direction  becomes  a  much 
more  efficient  way  to  represent  the  flow. 

Consider the case of a flow that is periodic in the z-direction and satisfies
the symmetry

$$\mathbf{u}(x, y, z, t) = \mathbf{u}(x, y, z + L, t).$$

Under these conditions $\mathbf{u}$ can be projected exactly onto a set of two-dimensional
Fourier modes $\mathbf{u}_q$ as

$$\mathbf{u}_q(x, y, t) = L^{-1} \int_0^L \mathbf{u}(x, y, z, t) \, e^{-i (2\pi/L) q z} \, dz.$$

Likewise, the Fourier modes $\mathbf{u}_q$ give the expansion of the velocity field in a
Fourier series:

$$\mathbf{u}(x, y, z, t) = \sum_{q=-\infty}^{\infty} \mathbf{u}_q(x, y, t) \, e^{i (2\pi/L) q z}.$$


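Substituting the Fourier expansion of the velocity field into the Navier-Stokes
equations, we obtain a coupled set of equations for the Fourier modes. To
simplify the notation, we define the scaled wavenumber $\beta_q = (2\pi/L) q$ and
the q-dependent operators

$$\tilde{\nabla} = (\partial_x, \partial_y, i \beta_q), \qquad \tilde{\nabla}^2 = \partial_x^2 + \partial_y^2 - \beta_q^2.$$

The evolution equation for the Fourier modes can then be written as

$$\tilde{\nabla} \cdot \mathbf{u}_q = 0 \quad \text{in } \Omega, \qquad (5.16a)$$

$$\partial_t \mathbf{u}_q = \mathbf{N}_q(\mathbf{u}) - \frac{1}{\rho} \tilde{\nabla} p_q + \frac{1}{Re} \tilde{\nabla}^2 \mathbf{u}_q \quad \text{in } \Omega. \qquad (5.16b)$$

The nonlinear advection term provides the coupling between all modes. We
can denote this term by

$$\mathbf{N}_q(\mathbf{u}) = L^{-1} \int_0^L \mathbf{N}(\mathbf{u}) \, e^{-i (2\pi/L) q z} \, dz. \qquad (5.16c)$$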

Dissipation becomes important at wavenumbers $\beta_d \sim Re^{1/2}$; at wavenumbers
$\beta > \beta_d$ the equations are dominated by viscosity. These high-wavenumber
modes contribute little to the dynamics of the flow at large scales because
their energy is rapidly dissipated by viscosity. For an adequate description of
the dynamics in a system with a given spanwise dimension L we only need
a finite set of M Fourier modes to cover the range of scales from $\beta = 0$ (the
mean flow) to $\beta_d = (2\pi/L) M \sim Re^{1/2}$, or $M = O(L \, Re^{1/2})$. We take as our
final representation of the velocity field the truncated expansion

$$\mathbf{u}(x, y, z, t) = \sum_{q=-M}^{M} \mathbf{u}_q(x, y, t) \, e^{i (2\pi/L) q z}.$$
q=-M 


Writing the equations in Fourier space reduces the problem for a
three-dimensional flow to a sequence of coupled two-dimensional problems. The
only coupling is through the nonlinear term which is again evaluated
explicitly. Computationally it is more convenient to follow the evolution of the
two-dimensional Fourier modes $\mathbf{u}_q(x, y, t)$ than the full three-dimensional field
$\mathbf{u}(\mathbf{x}, t)$. Because $\mathbf{u}$ is real, the Fourier modes satisfy the symmetry $\mathbf{u}_{-q} = \mathbf{u}_q^*$.
Therefore, only half of the spectrum ($q \ge 0$) is needed. In addition to
convenience, the Fourier representation of the velocity field has other intrinsic
advantages. It provides a direct way of linking particular modes of the
system with specific three-dimensional spatial patterns. Linear stability theory
can  predict  which  modes  will  have  the  strongest  interaction  with  the  two- 
dimensional  flow  to  produce  these  patterns.  The  time-averaged  amplitude  of 
the  Fourier  modes  gives  a  direct  indication  of  how  well-resolved  the  calcu¬ 
lations  are.  And  finally,  the  time-dependent  amplitude  of  the  Fourier  modes 
provides  a  convenient  way  of  explaining  the  transfer  of  energy  to  different 
scales  in  a  three-dimensional  flow. 
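The conjugate symmetry of the modes, and the fact that only the non-negative half of the spectrum needs to be stored, can be illustrated on a one-dimensional real periodic signal. This is a sketch under stated assumptions: the projection integral is approximated by the trapezoid rule on equally spaced samples, and the function names and the test signal are hypothetical.

```python
import cmath
import math


def fourier_mode(samples, q):
    """Approximate u_q = (1/L) * integral of u(z) exp(-i (2 pi / L) q z) dz.

    `samples` holds u at z_k = k * L / n for k = 0..n-1 (one full period).
    """
    n = len(samples)
    return sum(u * cmath.exp(-2j * math.pi * q * k / n)
               for k, u in enumerate(samples)) / n


def reconstruct(modes, M, z, L):
    """Truncated expansion using only the modes q = 0..M of a real field.

    Each pair (q, -q) contributes 2 * Re(u_q * exp(i beta_q z)).
    """
    beta = 2.0 * math.pi / L
    total = modes[0].real
    for q in range(1, M + 1):
        total += 2.0 * (modes[q] * cmath.exp(1j * beta * q * z)).real
    return total


L, n, M = 2.0, 32, 4
beta = 2.0 * math.pi / L


def signal(z):
    return 1.0 + 0.5 * math.cos(beta * z) - 0.25 * math.sin(3.0 * beta * z)


samples = [signal(k * L / n) for k in range(n)]
modes = [fourier_mode(samples, q) for q in range(M + 1)]
print(fourier_mode(samples, -3), modes[3].conjugate())
print(reconstruct(modes, M, 0.3, L), signal(0.3))
```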

The comments in the previous section about solving the pressure equation
apply only to the mean flow ($\beta = 0$) of a periodic three-dimensional flow. All
other wave numbers determine fluctuations about the mean and are uniquely
defined. The same techniques described for solving the two-dimensional
problem can be applied to the pressure system for $\beta = 0$.


5.2  Examples 


Next  we  present  a  variety  of  examples  that  illustrate  the  versatility  of  spectral 
element  methods  for  two  and  three-dimensional  flow  problems.  For  most  cases 
we  show  results  using  both  quadrilateral  and  triangular  spectral  elements.  We 
also  examine  problems  where  nonconforming  elements  and  adaptive  mesh 
refinement  are  used  to  automatically  generate  an  appropriate  discretization 
that  achieves  a  prescribed  error  tolerance. 


Wannier flow  The first example is an exact solution to the Stokes equations
($\mathbf{N}(\mathbf{u}) = 0$, $Re = 1$), but for a relatively complicated flow with
curvilinear boundaries. It is an exact solution derived by Wannier [82] for the
creeping flow past a rotating circular cylinder next to a moving wall. The
solution depends only on the cylinder radius, $r$, its rate of rotation, $\omega$,
the distance from the center of the cylinder to the moving wall, $d$, and the
velocity of the wall, $U$. For convenience we define $s^2 = d^2 - r^2$ and
$\Gamma = (d + s)/(d - s)$, and the constants:

\[ a_0 = U/\ln\Gamma, \]
\[ a_1 = -d\,(a_0 + \tfrac{1}{2} r^2\omega/s), \]
\[ a_2 = (d + s)(a_0 + \tfrac{1}{2} r^2\omega/s), \]
\[ a_3 = (d - s)(a_0 + \tfrac{1}{2} r^2\omega/s). \]

Next  we  define  the  following  functions  that  depend  on  position  (x,y): 

\[ Y_1(y) = y + d, \qquad Y_2(y) = 2Y_1(y), \]
\[ K_1(x,y) = x^2 + (s + Y_1(y))^2, \qquad K_2(x,y) = x^2 + (s - Y_1(y))^2. \]

In terms of these quantities, the solution can be written as:

\[ u_1(x,y) = U - 2(a_1 + a_0 Y_1)\left[\frac{s + Y_1}{K_1} + \frac{s - Y_1}{K_2}\right]
   - a_0 \ln(K_1/K_2)
   - \frac{a_2}{K_1}\left[s + Y_2 - \frac{(s + Y_1)^2 Y_2}{K_1}\right]
   - \frac{a_3}{K_2}\left[s - Y_2 + \frac{(s - Y_1)^2 Y_2}{K_2}\right], \]

\[ u_2(x,y) = \frac{2x}{K_1 K_2}(a_1 + a_0 Y_1)(K_2 - K_1)
   - \frac{x\,a_2 (s + Y_1) Y_2}{K_1^2}
   - \frac{x\,a_3 (s - Y_1) Y_2}{K_2^2}. \]

This problem was solved using nonconforming quadrilaterals and triangular
spectral elements. Figure 5.1 shows the corresponding computational domains
along with streamlines of the steady-state solution. Since the exact solution
is known, Dirichlet boundary conditions for the velocity can be applied along
the perimeter of the domain. The nonconforming mesh of quadrilateral elements
incorporates some local refinement near the cylinder and uses a total of
K = 40 elements. The triangular mesh uses K = 65 elements to discretize the
same region of space but with higher resolution near the cylinder.

We note a few items about the calculation using triangular elements.
Since this mesh uses curvilinear elements around the cylinder, it serves to test
the convergence of the method on distorted grids. All elements are mapped
to the standard triangle when performing integration. Because the curvilinear
elements are deformed, the Jacobian is not constant within them. A
non-constant Jacobian destroys the sparsity of the interior-interior coupling
submatrices in the global mass matrix and global stiffness matrix. Nevertheless,
since there are only a few of these elements, performance is not noticeably
affected [75].

Figure 5.2 shows the results from a p-convergence study for this flow.
The figure shows the $H^1$ error in the computed velocity field. As expected
for a smooth solution, the simulations converge exponentially to the exact
velocity field. Although the quadrilateral and triangular elements converge
at approximately the same rate, the actual value of the error for a given
order p depends on how elements are distributed in the domain. This results
in parallel convergence curves with different prefactors that depend on the
specifics of the grid.

Because the exact solution is known, this problem also makes a good test
case for adaptive mesh refinement. Figure 5.3 shows a convergence plot for
the Wannier flow solved by adaptively refining an initial coarse grid. This
results in convergence via adaptive h-refinements of the initial mesh. We
use the trace of the polynomial spectrum to drive adaptivity, and apply the
refinement criteria to each component of the velocity vector.

Fig. 5.1. Wannier flow, an exact solution for creeping flow past a rotating circular
cylinder near a moving wall: (a) streamlines of the exact solution corresponding to
the parameters $r = 0.25$, $d = 0.5$, $U = 1$, and $\omega = 2$; computational domain
discretized using (b) K = 40 quadrilateral spectral elements and (c) K = 65
triangular spectral elements.

Fig. 5.2. Convergence of the velocity field in the $H^1$ norm to the exact solution for
Wannier flow shown in the previous figure, plotted against the polynomial order p:
(□) quadrilateral and (△) triangular spectral elements. Note that errors have been
normalized by the domain size.

Fig. 5.3. Adaptive solution to the Wannier flow problem, plotted against the mesh
index: (•), computed $L^2$ error in the velocity field; (—), $10 \times \epsilon$ where
$\epsilon$ is the error estimated from the trace of the polynomial spectrum; the dashed
line is the prescribed error tolerance. Meshes $M_0$ and $M_5$ are shown above the plot.

The calculations are performed as follows. We start by solving the Stokes
equations on the initial coarse mesh, designated $M_0$. This mesh is then refined
to meet a prescribed value of the refinement parameter $\epsilon$. A new solution is
computed and the actual $L^2$ error is compared to the new estimate. The
process is iterated by lowering $\epsilon$ by a factor of 10 each time. In pseudo-code
the procedure looks like this:

    do n = 1, 5
        set eps = 1/10^$n               # set tolerance
        refine if trace(u1) > $eps      # update grid
        refine if trace(u2) > $eps      #
        solve(u1,u2)                    # update solution
    end

This produces a sequence of grids $M_1, M_2, \ldots, M_5$. Only the first and last
grids are shown in Fig. 5.3.
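The structure of this loop can be illustrated with a small self-contained
sketch. It is not the spectral element code: it assumes a one-dimensional model
problem in which a known function `f` stands in for a computed velocity
component, and it uses the magnitude of the highest Legendre coefficient on each
element as a hypothetical stand-in for the trace of the polynomial spectrum.
The names `indicator` and `refine` and the order `P = 4` are likewise
assumptions made for illustration.

```python
import numpy as np
from numpy.polynomial import legendre as leg

P = 4                                    # polynomial order per element (assumed)
XQ, WQ = leg.leggauss(P + 4)             # Gauss-Legendre nodes/weights on [-1, 1]

def f(x):
    """Stand-in for a solution component; varies rapidly near x = 0."""
    return np.exp(-20.0 * x)

def indicator(a, b):
    """|highest Legendre coefficient| of f on [a, b] (hypothetical indicator)."""
    x = 0.5 * (b - a) * XQ + 0.5 * (a + b)        # map nodes to the element
    c = np.zeros(P + 1)
    c[P] = 1.0
    phi = leg.legval(XQ, c)                       # Legendre polynomial of degree P
    return abs((2 * P + 1) / 2.0 * np.sum(WQ * f(x) * phi))

def refine(mesh, eps):
    """Bisect every element whose indicator exceeds eps until none does."""
    out, stack = [], list(mesh)
    while stack:
        a, b = stack.pop()
        if indicator(a, b) > eps:
            m = 0.5 * (a + b)
            stack += [(a, m), (m, b)]
        else:
            out.append((a, b))
    return sorted(out)

mesh = [(0.0, 1.0)]                      # initial coarse mesh M_0
sizes = []
for n in range(1, 6):                    # do n = 1, 5
    eps = 10.0 ** (-n)                   # set tolerance
    mesh = refine(mesh, eps)             # update grid
    sizes.append(len(mesh))              # a real code would re-solve here
```

In this model the refined elements cluster near x = 0, where the function
varies most rapidly, and the element count grows as the tolerance is lowered.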

There are two curves related to the error estimates shown in Fig. 5.3.
Look at the results for grid $M_2$. The dashed line indicates the prescribed error
tolerance used to generate that grid. This is an a priori error estimate in
the sense that the adaptive procedure refines grid $M_1 \to M_2$ until the new
tolerance has been met, but before a new solution is available. The solid line
indicates the estimated error on the new grid after the solution has been
regenerated. This is an a posteriori estimate. Finally, the symbol indicates
the true $L^2$ error on the new grid. The error estimate is sharp in the sense
that it follows the true error to within a constant factor, although for this
problem that constant is $\approx 10$.

Kovasznay flow  In 1948, Kovasznay solved the problem of steady, laminar
flow behind a two-dimensional grid [54]. This exact solution to the
Navier-Stokes equations is given by:

\[ u_1(x,y) = 1 - e^{\lambda x} \cos 2\pi y, \]
\[ u_2(x,y) = \frac{\lambda}{2\pi}\, e^{\lambda x} \sin 2\pi y, \]
\[ p(x,y) = \tfrac{1}{2}\left(1 - e^{2\lambda x}\right) + c, \]

where $\lambda = Re/2 - (Re^2/4 + 4\pi^2)^{1/2}$ and $c$ is an arbitrary constant. We look
at the solution for Re = 40.
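The Kovasznay solution is easy to evaluate, which is what makes it a convenient
benchmark. The following is a minimal sketch that evaluates the fields and
checks, by central differences, that they satisfy continuity and the steady
x-momentum equation. The function names, the sample grid, and the step `h` are
assumptions for illustration; the pressure is taken as
$\tfrac{1}{2}(1 - e^{2\lambda x}) + c$, the form that balances the momentum equation.

```python
import numpy as np

def kovasznay(x, y, Re=40.0, c=0.0):
    """Exact Kovasznay velocity components and pressure at (x, y)."""
    lam = Re / 2.0 - np.sqrt(Re**2 / 4.0 + 4.0 * np.pi**2)
    e = np.exp(lam * x)
    u1 = 1.0 - e * np.cos(2.0 * np.pi * y)
    u2 = lam / (2.0 * np.pi) * e * np.sin(2.0 * np.pi * y)
    p = 0.5 * (1.0 - e**2) + c
    return u1, u2, p

def residuals(x, y, Re=40.0, h=1e-3):
    """Central-difference residuals of continuity and steady x-momentum."""
    u1, u2, _ = kovasznay(x, y, Re)
    u1e, u2e, pe = kovasznay(x + h, y, Re)
    u1w, u2w, pw = kovasznay(x - h, y, Re)
    u1n, u2n, _ = kovasznay(x, y + h, Re)
    u1s, u2s, _ = kovasznay(x, y - h, Re)
    du1dx = (u1e - u1w) / (2.0 * h)
    du1dy = (u1n - u1s) / (2.0 * h)
    du2dy = (u2n - u2s) / (2.0 * h)
    dpdx = (pe - pw) / (2.0 * h)
    lap_u1 = (u1e + u1w + u1n + u1s - 4.0 * u1) / h**2
    div = du1dx + du2dy                                  # continuity
    mom_x = u1 * du1dx + u2 * du1dy + dpdx - lap_u1 / Re # x-momentum
    return div, mom_x

lam = 40.0 / 2.0 - np.sqrt(40.0**2 / 4.0 + 4.0 * np.pi**2)
x, y = np.meshgrid(np.linspace(-0.5, 1.0, 7), np.linspace(-0.5, 1.5, 9))
div, mom_x = residuals(x, y)
```

Both residuals vanish up to the truncation error of the finite differences.
Because $\lambda$ is negative, the perturbation to the uniform stream decays
downstream.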

The Kovasznay flow pattern is similar to the low-speed flow of a viscous
fluid past an array of cylinders. Figure 5.4 shows streamlines of the
steady solution and computational domains using quadrilateral and triangular
spectral elements. The exact solution was used to apply Dirichlet boundary
conditions, and the Navier-Stokes equations were integrated to obtain
a steady-state solution on the interior of the domain. Figure 5.5 shows the
results of a p-convergence study for this problem. Again we observe exponential
convergence of the solution in both methods, and at roughly the same
rate.


Fig. 5.4. Kovasznay flow, an exact solution to the Navier-Stokes equations: (a)
streamlines of the exact solution corresponding to Re = 40; computational domain
discretized using (b) K = 8 quadrilateral spectral elements and (c) K = 16
triangular spectral elements.


Next we consider solving this problem using adaptive mesh refinement
and nonconforming elements. Figure 5.6 shows the results from the same
type of convergence study that was presented in Sect. 5.2 for Wannier flow.
In this problem the error estimate differs from the true $L^2$ error by about
a factor of 4. Note that the initial grid is so coarse that the estimate is
completely unreliable: the estimated error on the refined mesh $M_1$ is higher!
This emphasizes the fact that the adaptive procedure should really only be
applied to a solution that is well-resolved in the sense that
$\|u_h\| \approx \|u\|$. That assumption is, after all, at the heart of the error
estimate. Also note that the refinement $M_1 \to M_2$ produces such a large drop in
the error that the next iteration $M_2 \to M_3$ does not change it at all.$^3$
Although this problem is relatively simple, it gives some additional confidence
that the error estimate provides a meaningful measure of the approximation
error even for a fully nonlinear problem.

Fig. 5.5. Convergence of the velocity field in the $H^1$ norm to the exact solution
for Kovasznay flow as shown in the previous figure, plotted against the polynomial
order p: (□) quadrilateral and (△) triangular spectral elements.


Lid-driven cavity  The first two examples provide good benchmarks because
they are exact solutions to the fluid equations. However, the solutions
are really too simple to warrant the use of adaptive mesh refinement. The
next example clearly does benefit from the use of an automatic procedure to
construct appropriate grids.

Consider the case of a lid-driven cavity (LDC). The flow within the cavity
is driven from above by a lid moving with unit velocity, and the problem is
non-dimensionalized so that the cavity has unit length on each side. Boundary
conditions for the velocity are $(u_1, u_2) = (1, 0)$ along the top boundary and
$(u_1, u_2) = (0, 0)$ on the three remaining sides. Note that the velocity boundary
conditions are discontinuous at the corners, making this an extremely difficult
problem to resolve with a high-order method. It is one of the situations where
p-refinement degenerates, ruling that out as a practical way to resolve the
flow.

3  If you look at Fig. 5.6 and think the actual $L^2$ error increases from $M_2$ to
$M_3$, then you are seeing an optical illusion caused by the fact that the error
estimate decreases slightly.

Fig. 5.6. Adaptive solution to the Kovasznay flow problem, plotted against the mesh
index: (•), computed $L^2$ error in the velocity field; (—), $4 \times \epsilon$ where
$\epsilon$ is the error estimated from the trace of the polynomial spectrum; the dashed
line is the prescribed error tolerance. Meshes $M_0$, $M_1$, and $M_5$ are shown above
the plot.

We will look at two types of parameter variation for this problem: numerical
convergence as $\epsilon \to 0$ for fixed Re = 1000, and evolution of the grid
with increasing Re for fixed $\epsilon = 10^{-6}$. All of these calculations will use a
fixed polynomial order of p = 7 and apply the trace of the polynomial spectrum
as a refinement criterion. Note that temporal refinement is necessary as
well: a suitable time step is chosen for each new domain so that the time
integration remains stable. The time step in these calculations varies from
$\Delta t = 0.01$ to $\Delta t = 0.000625$. Because we are using an implicit method to
solve the Navier-Stokes equations, local time stepping is not an option.

First consider the problem of computing the steady-state LDC flow at a
fixed value of Re = 1000. Figure 5.7 shows the adaptively generated grid and
corresponding vorticity field for different values of the refinement parameter
$\epsilon$. The initial coarse grid for this calculation was simply the unit square.
Several intermediate grids were generated prior to the one shown in Fig. 5.7(a).
The refinement procedure proceeds in a similar manner to that described
previously: the solution is integrated for a specified amount of time, then
the refinement criterion is applied to the components of the velocity field to
produce a new grid. The old solution is projected onto the new grid and the
next iteration begins.

The solution shown in Fig. 5.7(a) with $\epsilon = 10^{-3}$ is quite coarse and clearly
a poor approximation to anything resembling the vorticity of a real flow. The
next adaption ($\epsilon = 10^{-4}$) refines the entire domain one level and attempts
to resolve the shear layers along the upper and right walls. At $\epsilon = 10^{-5}$ it
picks out high vorticity regions along the left and bottom walls and continues
to refine the shear layers that emerge from each upper corner. Finally,
at $\epsilon = 10^{-6}$ the interior of the cavity is refined uniformly and a fine grid is
generated near each corner and in the direction just downstream. This process
could be continued to achieve an arbitrarily high degree of accuracy but
the solution in Fig. 5.7(d) is certainly a good approximation to the flow at
this Reynolds number. Keep in mind that the vorticity is a derived quantity
obtained by differentiating the velocity field. It is not even continuous
in this approximation, although continuity of higher derivatives is obtained
as part of the convergence process. For example, compare Figs. 5.7(a) and
5.7(d). It is comforting that the refinement criterion applied to the velocity
field automatically picks out the physically important features of the flow.

We  can  use  the  same  ideas  to  study  how  the  flow  evolves  with  changes 
in  Re.  At  a  given  value  of  Re  we  use  the  adaptive  procedure  to  generate 
a  steady-state  solution  with  a  prescribed  tolerance  e.  That  solution  serves 
as  an  initial  guess  for  the  next  value  of  Re.  The  adaptive  procedure  keeps 
the  solution  well-resolved  as  the  flow  develops  more  complex  structure  with 
increasing  Re. 

Figure 5.8 shows the evolution of the flow and adaptively generated grids
for this kind of parameter study. At low Reynolds number the cavity contains
a diffuse vorticity field. Vorticity becomes more concentrated along the
walls with increasing Reynolds number. The adaptive procedure tracks these
changes and refines the grid to an appropriate level at each value of Re. Note
that the corner region is refined to the same level at Re = 10 and Re = 1000.
This is because the nature of the boundary condition-induced singularity is
independent of Re, as opposed to the physically important behavior that
emerges as dissipation is removed from the system.

Fig. 5.8. Parameter study in Re using adaptive grids generated to a tolerance of
$\epsilon = 10^{-6}$: (a) Re = 10; (b) Re = 100; (c) Re = 250; (d) Re = 500. All
calculations used a fixed polynomial order of p = 7.

Although there is obviously no exact solution for this problem, we can
compare with high-resolution numerical simulations of the same flow to
demonstrate that the adaptive procedure produces an accurate approximation. In
a recent study, Botella and Peyret [16] compute solutions to the LDC flow
at Re = 1000 using a Chebyshev collocation method. To improve the accuracy
of the calculations they use an analytic approximation to subtract off
the singular part of the solution near the corners and compute the remaining
smooth part numerically. By explicitly removing the singular part of the
solution they can recover spectral accuracy and exponential convergence. In
contrast, the calculations presented here attempt to “resolve” the singularity
directly through mesh refinement near the corners.

Figure 5.9 compares profiles of the u- and v-components of velocity along
the centerline of the cavity. Data for the comparison are taken from Tables 9
and 10 of Botella and Peyret [16]. These values correspond to calculations
with N = 160 Chebyshev modes in each direction, or 25 600 grid points. The
spectral element data correspond to Fig. 5.7(d); this mesh has
$K \times N^2 \approx 7500$ grid points. The comparison shows that the two calculations
are in extremely close agreement, and demonstrates that the adaptive procedure
results in a highly accurate solution for small $\epsilon$ with an intelligent
distribution of element size.


Fig. 5.9. Comparison of velocity profiles through the center of the cavity
($x_c = y_c = \tfrac{1}{2}$) at Re = 1000: •, spectral results from Botella & Peyret
(1998); —, adaptive spectral element calculation with tolerance $\epsilon = 10^{-6}$.

Additional information on adaptive spectral element calculations of the
lid-driven cavity problem can be found in [60], including the use of directional
splitting as an efficient way to refine elements in regions where the flow may
be resolved in one direction but under-resolved in another.


Fig. 5.10. Unsteady forces on an impulsively accelerated NACA 0012 airfoil at
$\alpha = 22.5$ degrees and Re = 852. Points (○) indicate refinement steps during the
simulation.


Impulsively accelerated airfoil  Next we look at an application of this
technique to an unsteady flow problem. Consider the motion of an airfoil
that is set at an angle of attack $\alpha$ and impulsively accelerated into a still
fluid. Dimensional parameters are the chord length $c$, the airfoil acceleration
$a$, and the kinematic viscosity of the fluid $\nu$. From these parameters we need
to choose a length scale $L$ and a velocity scale $U$. The fluid motion satisfies the
incompressible Navier-Stokes equations which we will solve in a non-inertial
reference frame attached to the accelerating airfoil. In non-dimensional form
the governing equations are:

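\[ \nabla \cdot \mathbf{u} = 0, \tag{5.17} \]

\[ \partial_t \mathbf{u} = \mathbf{N}(\mathbf{u}) - \frac{1}{\rho}\nabla p
   + \frac{\nu}{UL}\nabla^2 \mathbf{u} - \frac{L}{U^2}\,\mathbf{a}. \tag{5.18} \]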

Note that $\mathbf{a} = -a\,(\cos\alpha, \sin\alpha)$ is the frame acceleration. A natural
and obvious choice for the reference scales is to normalize for unit acceleration
by taking $L = c$ and $U = \sqrt{ac}$. From these we can also form a time scale
$T = L/U = \sqrt{c/a}$. The similarity variable or Reynolds number is then
$Re = \sqrt{ac^3}/\nu$, and the problem is completely specified by prescribing
$\alpha$ and Re. These equations can be integrated using the same technique described
in Sect. 5.1 by including the frame acceleration in the integration of the
nonlinear terms and the pressure boundary conditions.
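As a worked illustration of this scaling, the short sketch below computes the
reference scales from dimensional inputs. The function name and the numerical
values of the chord, acceleration, and viscosity are hypothetical and are not
taken from the experiments; they only show that the choice $U = \sqrt{ac}$ gives
unit non-dimensional acceleration and $Re = UL/\nu$.

```python
import math

def acceleration_scales(c, a, nu):
    """Reference scales for an airfoil of chord c with constant acceleration a."""
    L = c                            # length scale
    U = math.sqrt(a * c)             # velocity scale
    T = L / U                        # time scale, equal to sqrt(c / a)
    Re = math.sqrt(a * c**3) / nu    # Reynolds number based on U and L
    return {"L": L, "U": U, "T": T, "Re": Re}

# Hypothetical dimensional values in SI units (illustration only).
c, a, nu = 0.1, 2.0, 1.0e-6
scales = acceleration_scales(c, a, nu)

# Non-dimensional acceleration magnitude a * L / U**2.
unit_acceleration = a * scales["L"] / scales["U"] ** 2
```

With these scales the dimensional free-stream velocity $at$ becomes simply the
non-dimensional time, since $(at)/U = t/T$.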

The parameters for this calculation correspond to a companion set of
experiments conducted at GALCIT for a NACA 0012 airfoil in water [33].
The airfoil sits at an angle of attack $\alpha = 22.5$ degrees and the Reynolds
number is set to Re = 852 to match the experimental setup. Note that the
free-stream velocity increases with time as $U_\infty = at$ owing to the constant
acceleration. The angle of attack and Reynolds number are set to large values
so that the flow over the airfoil separates almost immediately and produces
a complex vorticity field just above the upper surface.

The problem was solved on a large computational domain with order p = 7
elements. The initial grid of $K \approx 140$ elements was built ‘by hand’ to provide
a sufficiently accurate discretization for starting the adaptive procedure. As a
metric for adaption we required the vorticity field $\xi = \nabla \times \mathbf{u}$ to be
represented on the computational grid with a discretization tolerance of
$\epsilon = 0.01$ for the local polynomial spectrum. In this case we are applying the
refinement criterion to a physically important derived quantity rather than one
of the primitive variables. Adaption steps were carried out at constant time
intervals of $\Delta T = 0.1$ during the integration from t = 0 to t = 2.4.

Figure  5.10  shows  the  unsteady  loading  on  the  airfoil  as  it  accelerates.  The 
solid  line  in  this  figure  connects  the  force  computed  at  each  time  step  in  the 
simulation.  The  points  indicate  the  discrete  times  when  the  grid  is  adapted 
to  maintain  resolution.  This  figure  is  shown  primarily  to  document  that  the 
adaptive  procedure  evolves  smoothly  and  does  not  produce  discontinuous 
jumps  in  the  loading  on  the  airfoil. 

The developing vorticity field is shown in Fig. 5.11. The airfoil leaves
a weak starting vortex in its wake and rapidly develops a strong region of
separated vorticity along the upper (opposite to the direction of acceleration)
surface. The refinement criterion maintains a sharp resolution of the vorticity
field at all times. During the course of the calculations the number of active
elements in the mesh increases from $K \approx 180$ to $K \approx 480$, giving a total of
$\approx 30720$ grid points in the final mesh. The most aggressive mesh refinement
takes place early in response to the strong vorticity layer near the leading edge
of the airfoil and the singularity produced by the sharp trailing edge. Similar
to the LDC flow described in Sect. 5.2, the singularity along the boundary
requires the most attention from the adaptive procedure. Once these parts
of the flow are resolved there are relatively few additional refinements to
maintain resolution of the separated vorticity field. To keep the integration
stable the time step is reduced by about two orders of magnitude to a final
value of $\Delta t \approx 7.5 \times 10^{-5}$. This maintains a relatively constant CFL number
during the simulation. This necessary reduction in $\Delta t$ is due to a combination
of the mesh refinement and the increasing free-stream velocity $U_\infty$.

Fig. 5.12. Plot of St versus Re for the flow past a circular cylinder. Experiments: ○,
Williamson [87]; •, Hammache & Gharib [38]; 3D simulations: +, Henderson [41].
The solid line is a curve fit to two-dimensional simulation data for Re up to 1000 [41].

A detailed comparison of the computational and experimental results for
this problem is the subject of current work.

Cylinder wake  Understanding the fluid flow around a straight circular
cylinder is one of the most fundamental problems in fluid mechanics. It is a
model for flow around bridges, buildings, and many other non-aerodynamic
objects. Recent work, both experimental and computational, has revealed
some exciting new information about the nature of this flow including
intricate three-dimensional structures that emerge just prior to the onset of
turbulence in the wake. In this section we describe spectral element
calculations of the two-dimensional flow and then pick it back up in Sect. 6.3 to
look at methods for studying the subsequent transition to turbulence.

The system considered is an infinitely long cylinder placed perpendicular
to an otherwise uniform open flow. The sole parameter for this system is then
the Reynolds number: $Re = U_\infty d/\nu$, where $U_\infty$ is the free-stream velocity and
$d$ is the cylinder diameter. First we describe some of the physically important
behavior in this flow, and then come back to details of how it can be
simulated. It helps to begin with a ‘road-map’ for the sequence of bifurcations
that take the flow from simple to more complex states. There are two useful
quantities to form such a guide to understanding: the non-dimensional shedding
frequency and the mean drag coefficient $C_D$. Both shedding frequency
and drag show distinct changes at the various bifurcation points of the wake
and can be used as a guide to interpreting changes in the wake structure and
dynamics as a function of Reynolds number.

Fig. 5.13. Drag coefficient as a function of Reynolds number for the flow past a
circular cylinder. Experiments: (○, •), Wieselsberger [83]; 3D simulations: +,
Henderson [41]. The solid line is a curve fit to two-dimensional simulation data for Re
up to 1000 [40].

In non-dimensional form the shedding frequency is referred to as the
Strouhal number. It is defined as $St = fd/U_\infty$, where $f$ is the peak oscillation
frequency of the wake. The Strouhal-Reynolds number relationship is
shown in Fig. 5.12. At low Reynolds number the flow is steady (St = 0) and
symmetric about the centerline of the wake. At $Re_1 \approx 47$ the steady flow
becomes unstable and bifurcates to a two-dimensional, time-periodic flow.
The shedding frequency of the two-dimensional flow increases smoothly with
Reynolds number along the curve shown in Fig. 5.12. Note that each point
along the two-dimensional curve represents a perfectly time-periodic flow
and there is no evidence of further two-dimensional instabilities for Reynolds
numbers up to $Re \approx 1000$. At $Re_2 \approx 190$ the two-dimensional wake becomes
absolutely unstable to long-wavelength spanwise perturbations and bifurcates
to a three-dimensional flow (mode A). Experiments and computations indicate
a further instability at $Re_2' \approx 260$ marked by the appearance of fine scale
streamwise vortices. We will return to these instabilities in Sect. 6.3.
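The definition of the Strouhal number translates directly into a
post-processing step: find the peak frequency of a wake signal and scale it by
$d/U_\infty$. The sketch below assumes a uniformly sampled time series (for example
a lift history) and uses NumPy's real FFT; the function name `strouhal` and the
synthetic signal, including its frequency of 0.2, are hypothetical values chosen
only to exercise the code.

```python
import numpy as np

def strouhal(signal, dt, d=1.0, U=1.0):
    """Strouhal number St = f d / U from the spectral peak of a time series."""
    s = np.asarray(signal, dtype=float)
    s = s - s.mean()                          # remove the mean (q = 0) component
    spectrum = np.abs(np.fft.rfft(s))
    freqs = np.fft.rfftfreq(len(s), d=dt)     # frequencies in cycles per time unit
    f_peak = freqs[np.argmax(spectrum)]       # peak oscillation frequency
    return f_peak * d / U

# Synthetic, perfectly periodic signal in non-dimensional units (d = U = 1).
dt = 0.01
t = np.arange(20000) * dt                     # record length of 200 time units
lift = 0.3 * np.sin(2.0 * np.pi * 0.2 * t)
st = strouhal(lift, dt)
```

The frequency resolution of this estimate is the reciprocal of the record
length, so a long record of the saturated periodic state is needed to resolve
small changes in St with Re.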

Figure 5.13 shows the drag curve for flow past a circular cylinder for
Reynolds number up to 1000. In the computations the spanwise-averaged
fluid force $\mathbf{F}(t)$ is computed by integrating the shear stress and pressure
over the surface of the cylinder. The x-component of $\mathbf{F}$ is the drag, the
y-component is the lift. Because $C_D$ is determined from an average over the
surface of the cylinder, it is much less sensitive to changes in the character
of the wake at low Reynolds number than single-point measurements like
the shedding frequency. The ‘textbook’ version of the drag curve is generally
plotted on a log-log scale where the only discernible feature is the drag crisis
at $Re = O(10^5)$. The flat response of $C_D$ to changes in Reynolds number is
compounded by the fact that experimental drag measurements are extremely
difficult to make at low Reynolds number, and subtle details of the drag curve
are lost in the experimental scatter. The decrease in magnitude of $C_D$ in the
steady regime can be fitted to a power-law curve and also makes a sharp but
continuous transition at $Re_1$. Henderson [40] gives the form and coefficients
for the steady and unsteady drag curves.

Fig. 5.14. Computational domains used for simulating the flow past a circular
cylinder. Each domain is a subset of the largest. The parameters $L_o$ and $L_i$ determine
the cross-sectional size, and $L$ determines the spanwise dimension.

This problem is extremely challenging because it combines several features
that are difficult to handle numerically: unsteady separation, thin boundary
layers, outflow boundary conditions, and the need for a large computational
domain to simulate an open flow. If the computational domain is too small
the simulation suffers from blockage. This can have a significant impact on
quantities like the shedding frequency, generally producing higher frequencies
in the simulations than are observed in experiments [49]. If resolution near
the cylinder is sacrificed for the sake of a larger computational domain then
the physically important flow dynamics may not be computed accurately.

Figure 5.14 shows a sequence of computational domains used to simulate
both 2D and 3D wakes using nonconforming quadrilateral elements [41].
Boundary conditions are imposed as follows. Along the left, upper, and lower
boundaries we use free-stream conditions: $(u_1, u_2, u_3) = (1, 0, 0)$. At the
surface of the cylinder the velocity is equal to zero (no-slip). Along the right
boundary we use a standard outflow boundary condition for velocity and
pressure:

\[ p = 0, \qquad \partial_x u_i = 0. \]

Along all other boundaries the pressure satisfies (5.11).

These domains use large elements away from the cylinder and outside the
wake where the flow is smooth. Local mesh refinement is used to resolve the
boundary layer, near wake, and wake regions downstream of the cylinder. In
this case the refinement is done beforehand and the mesh is static. Clearly
from Figs. 5.12 and 5.13 the simulations predict values of the shedding
frequency and drag that agree extremely well with experimental studies up to
the point of 3D transition. Just as important as good agreement with
experiments, the simulation results are independent of the grid as shown by a
detailed h- and p-refinement study [9].

6  Instability,  Transition,  and  Turbulence 

In  the  examples  presented  thus  far  we  have  been  building  towards  more  and 
more  complex  flows.  In  this  final  section  we  consider  methods  for  studying  one 
of the most complex phenomena in fluid dynamics: transition to turbulence.
Applications in this area are particularly demanding and a good match to the
low numerical dissipation and dispersion errors offered by high-order methods.
For example, a physical instability may be suppressed in a numerical
method with excessive artificial viscosity, or it may be triggered prematurely
by numerical dispersion errors. Spectral element methods applied to problems
in transition and turbulence offer the additional ability to simulate
geometrically complex domains. This opens a wide range of possibilities for
studying interesting problems in this area.

First  we  outline  some  basic  tools  for  computing  linear  and  nonlinear  insta¬ 
bilities  of  a  system  efficiently.  These  tools  build  on  the  high-order  integration 
schemes  outlined  in  Sect.  5.1.  The  discussion  here  is  based  on  the  framework 
for  bifurcation  analysis  presented  by  Tuckerman  &  Barkley  [81];  they  dis¬ 
cuss  additional  analysis  tools  such  as  efficient  methods  for  computing  steady 
states  and  performing  continuation. 


6.1  Linear  stability  analysis 

For  the  sake  of  the  following  discussion,  we  can  write  the  Navier-Stokes 
equations  in  the  ‘schematic’  form: 

$\partial_t U = N(U) + LU,$  (6.19)

where N(U) and LU are the operators defined previously. The velocity field
U represents the discretized solution vector whose dimension we denote by
M. We assume this number is quite large, typically $O(10^4)$.

Exponential  power  method  Now  consider  the  problem  of  determining 
the linear stability of steady states. The stability of U is governed by the
eigenvalues $\lambda$ of the Jacobian $A = N_U + L$:

$(N_U + L) u = \lambda u.$  (6.20)

This follows from the fact that small perturbations to U evolve according to
the linearized stability equations:

$\partial_t u = (N_U + L) u.$  (6.21)

To determine the stability of U it is sufficient to know whether any eigenvalues
have positive real part. Additional information about the leading parts
of the spectrum can also be useful, as well as the structure of the corresponding
eigenvectors. In other words, we would like to know complete information
about a few of the leading eigenpairs. We assume that the interesting systems
are all too large to construct the Jacobian directly and compute all eigenvalues
and eigenvectors via the QR algorithm (operation count $O(M^3)$), so
iterative methods are the key.

The  basic  iterative  technique  to  compute  selected  eigenpairs  is  the  power 
method.  In  this  method  one  acts  repeatedly  with  the  matrix  A  on  an  arbitrary 
initial vector $u_0$ to produce the sequence of vectors $u_n = A^n u_0$. This sequence
approaches the dominant eigenvector, and the sequence of Rayleigh quotients
$\lambda_n = u_n^T A u_n / u_n^T u_n$ converges to the corresponding eigenvalue.
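The power iteration and Rayleigh quotient just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `power_method`, the iteration count, and the small diagonal test matrix are hypothetical choices, and the vector is renormalized at every step (which does not change its direction) to avoid overflow.

```python
import numpy as np

def power_method(A, u0, n_iter=200):
    """Power iteration: returns (Rayleigh quotient, normalized vector)."""
    u = u0 / np.linalg.norm(u0)
    for _ in range(n_iter):
        u = A @ u                      # act with the matrix
        u = u / np.linalg.norm(u)      # renormalize; the direction is unchanged
    lam = (u @ A @ u) / (u @ u)        # Rayleigh quotient
    return lam, u

# Illustrative test matrix with known eigenvalues 1, 2 and 5.
A = np.diag([1.0, 2.0, 5.0])
lam, u = power_method(A, np.ones(3))
```

For this test matrix the iteration should settle on the eigenvalue of largest magnitude, at a rate set by the ratio of the two largest magnitudes.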

Two  modifications  are  needed  to  make  the  power  method  useful  for  sta¬ 
bility  analysis.  As  stated  above,  we  need  a  few  eigenpairs,  not  just  the  domi¬ 
nant  one.  The  calculation  of  several  eigenpairs  is  accomplished  by  the  Arnoldi 
method or one of its variations [4, 69]. Initially we form the sequence $u_0$, $Au_0$,
..., $A^{K-1}u_0$, whose span defines the Krylov space. K is the number of
eigenpairs sought. These vectors are orthonormalized to form a basis $v_1, v_2, \ldots,
v_K$ for the Krylov space. We define the M × K matrix $V(i,k) = v_k(i)$ and the
K × K Hessenberg matrix $H = V^T A V$. When H is diagonalized, its eigenvalues
approximate K of the eigenvalues of A, and V times its eigenvectors
approximate K of the eigenvectors of A.
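A minimal sketch of this construction follows, under some assumptions. Rather than forming the raw powers of A and orthonormalizing afterwards, it applies A to the most recent basis vector and orthogonalizes immediately, which spans the same Krylov space but is the numerically safer variant. The function name `arnoldi`, the test matrix, and the sizes M = 200 and K = 20 are illustrative choices only.

```python
import numpy as np

def arnoldi(A, u0, K):
    """Build an orthonormal Krylov basis V (M x K) and H = V^T A V (K x K)."""
    M = u0.size
    V = np.zeros((M, K))
    V[:, 0] = u0 / np.linalg.norm(u0)
    for k in range(1, K):
        w = A @ V[:, k - 1]
        for _ in range(2):  # Gram-Schmidt, applied twice for robustness
            w = w - V[:, :k] @ (V[:, :k].T @ w)
        V[:, k] = w / np.linalg.norm(w)
    H = V.T @ (A @ V)
    return V, H

rng = np.random.default_rng(0)
M, K = 200, 20
# Illustrative matrix: two well-separated leading eigenvalues (10 and 8).
A = np.diag(np.concatenate(([10.0, 8.0], np.linspace(0.0, 1.0, M - 2))))
V, H = arnoldi(A, rng.standard_normal(M), K)

ritz_vals, ritz_vecs = np.linalg.eig(H)        # diagonalize the small matrix
order = np.argsort(-ritz_vals.real)
approx_vals = ritz_vals.real[order][:2]        # approximate leading eigenvalues of A
approx_vec = V @ ritz_vecs[:, order[0]].real   # approximate leading eigenvector of A
```

Only the small K × K matrix is ever diagonalized, so the cost is dominated by the K applications of A.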

The  second  modification  is  to  change  the  region  of  the  complex  plane 
where  eigenvalues  are  sought.  The  dominant  eigenvalues  (those  largest  in 
magnitude)  are  not  of  interest.  These  correspond  to  the  same  exponentially 
decaying  modes  that  motivated  the  use  of  a  semi-implicit  integration  scheme 
in  Sect.  5.1.  We  want  the  leading  eigenvalues,  i.e.  those  with  largest  real  part. 

The  solution  to  the  linearized  stability  problem  (6.21)  is: 

$u(t + \Delta t) = e^{\Delta t (N_U + L)} u(t).$  (6.22)

The leading eigenvalues of any matrix A are the dominant ones of $\exp(\Delta t\, A)$
for any positive $\Delta t$. The time integration scheme developed for the full
Navier-Stokes equations is readily available as an approximation to (6.22),
and this is the connection to the power or Arnoldi method: acting with the
operator $\exp(\Delta t\, A)$ is equivalent to integrating the linearized equations over
one time step.
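The idea can be illustrated on a small matrix. The sketch below is an assumption-laden toy, not the semi-implicit scheme of Sect. 5.1: it uses a backward-Euler step as a stand-in for acting with exp(Δt A), and the 4 × 4 matrix, step size and iteration count are arbitrary. Because the time stepper shares its eigenvectors with A, the Rayleigh quotient formed with A itself recovers the leading eigenvalue without time-discretization error.

```python
import numpy as np

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
eigs = np.array([-200.0, -5.0, -1.0, 0.3])   # leading eigenvalue 0.3, dominant -200
A = Q @ np.diag(eigs) @ Q.T

def time_step(u, dt):
    """One backward-Euler step of du/dt = A u; a stand-in for exp(dt*A) u."""
    return np.linalg.solve(np.eye(A.shape[0]) - dt * A, u)

def leading_eigenpair(u0, dt=0.1, n_steps=2000):
    """Power method applied to the time stepper instead of to A."""
    u = u0 / np.linalg.norm(u0)
    for _ in range(n_steps):
        u = time_step(u, dt)
        u = u / np.linalg.norm(u)
    lam = u @ A @ u          # Rayleigh quotient with A (u has unit norm)
    return lam, u

lam, u = leading_eigenpair(rng.standard_normal(4))
```

A plain power method on A would instead lock onto the strongly decaying mode with eigenvalue -200, which is the situation the exponential transformation is meant to avoid.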

A single change is required in the time stepping code: replace the function
that computes the nonlinear term N(U) with an equivalent function to
compute:

$N_U u = (U \cdot \nabla) u + (u \cdot \nabla) U.$

Therefore  it  is  a  simple  matter  to  adapt  the  time  stepping  algorithm  (5.8)  to 
integrate  the  linearized  equations. 
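The structure of the linearized term can be checked on a one-dimensional periodic analogue. The sketch below assumes a scalar field with N(U) = U ∂U/∂x and a Fourier derivative; the names `ddx`, `N` and `N_lin` and the test fields are hypothetical and are not taken from the solver described in the text. It compares the linearized term against a finite-difference directional derivative of N.

```python
import numpy as np

n = 64
x = 2 * np.pi * np.arange(n) / n
k = np.fft.fftfreq(n, d=1.0 / n)     # integer wavenumbers

def ddx(f):
    """Spectral derivative of a periodic function sampled on x."""
    return np.real(np.fft.ifft(1j * k * np.fft.fft(f)))

def N(U):
    """1D analogue of the advection term, (U . grad) U."""
    return U * ddx(U)

def N_lin(U, u):
    """Linearization of N about U acting on u: (U . grad) u + (u . grad) U."""
    return U * ddx(u) + u * ddx(U)

U = 1.0 + 0.5 * np.sin(x)            # illustrative base state
u = np.cos(2 * x)                    # illustrative perturbation
eps = 1e-6
fd = (N(U + eps * u) - N(U)) / eps   # finite-difference directional derivative
err = np.max(np.abs(fd - N_lin(U, u)))
```

The two agree up to a remainder of order eps, since the neglected term is eps (u ∂u/∂x).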


Floquet  stability  analysis  The  exponential  power  method  can  be  easily 
adapted  to  compute  the  stability  of  periodic  orbits  rather  than  steady  states. 
Consider a T-periodic solution $U(t \bmod T)$. The operator $N_U$ appearing in
the linearized equations (6.21) will also be T-periodic, and it is no longer
sufficient to look at the eigenvalues of the constant Jacobian matrix. Instead,
stability is determined by the eigenvalues of the operator

$B = \exp\left( \int_{t_0}^{t_0+T} \left( N_{U(t')} + L \right) dt' \right).$  (6.23)

This operator takes a small perturbation $u(t_0)$ and evolves it once around
the orbit to give the perturbation at time $t_0 + T$. In practice the action of B
is computed by integrating (6.21) over $T/\Delta t$ time steps.

The eigenvalues $\mu$ of B are known as Floquet multipliers. For an initial
condition $u(t_0)$ that is an eigenmode of B, the solution to (6.21) is of the
form

$u(t) = \tilde{u}(t \bmod T)\, e^{\lambda t},$  (6.24)

where $\lambda = (\log \mu)/T$ is called a Floquet exponent and $\tilde{u}(t \bmod T)$ is called
a Floquet mode. The dominant Floquet multipliers (leading Floquet modes)
can be computed by applying the exponential power method to the operator
B.

Acting with B on a vector u means integrating the linear stability equations
over one full period, which in turn means knowing the base flow U at
each  time  step.  Because  the  solutions  are  time-periodic,  a  natural  simplifi¬ 
cation  is  to  represent  U  with  a  Fourier  series  in  time  and  only  keep  enough 
modes  to  maintain  a  level  of  accuracy  consistent  with  the  rest  of  the  compu¬ 
tations. 
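A toy version of the Floquet calculation is sketched below, under clearly artificial assumptions: the T-periodic linearized operator is replaced by a hypothetical 2 × 2 matrix `jacobian(t)` whose multipliers are known in closed form (exp(0.2π) and exp(-π)), and the period is integrated with a classical fourth-order Runge-Kutta scheme rather than the scheme of Sect. 5.1. The function `apply_B` plays the role of acting with B, and a plain power iteration extracts the dominant multiplier.

```python
import numpy as np

T = 2 * np.pi
c, s = np.cos(0.7), np.sin(0.7)
Q = np.array([[c, -s], [s, c]])

def jacobian(t):
    """T-periodic linearized operator (toy 2x2 stand-in for N_U(t) + L)."""
    D = np.diag([0.1 + np.cos(t), -0.5 + np.sin(t)])
    return Q @ D @ Q.T

def apply_B(u, n_steps=400):
    """Evolve a perturbation once around the orbit with classical RK4."""
    dt = T / n_steps
    t = 0.0
    for _ in range(n_steps):
        k1 = jacobian(t) @ u
        k2 = jacobian(t + dt / 2) @ (u + dt / 2 * k1)
        k3 = jacobian(t + dt / 2) @ (u + dt / 2 * k2)
        k4 = jacobian(t + dt) @ (u + dt * k3)
        u = u + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += dt
    return u

u = np.array([1.0, 0.0])
for _ in range(20):                     # power iteration on B
    Bu = apply_B(u)
    mu = (u @ Bu) / (u @ u)             # estimate of the dominant multiplier
    u = Bu / np.linalg.norm(Bu)

floquet_exponent = np.log(mu) / T
```

In this toy problem the dominant multiplier exceeds one, so the corresponding Floquet exponent is positive and the periodic state would be judged unstable.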

6.2  Nonlinear  stability  analysis 

The  final  tool  we  need  is  a  means  of  distinguishing  whether  a  bifurcation  is 
subcritical  or  supercritical.  Consider  the  normal  form  for  a  pitchfork  bifur¬ 
cation: 

$\partial_t A = \sigma (R - R_c) A - \alpha A^3,$  (6.25)

where A is the amplitude of the bifurcating mode, R is the control parameter,
$R_c$ is the bifurcation point, $\sigma$ is a positive constant relating changes in R to
changes in the leading eigenvalue, and $\alpha$ (the Landau coefficient) determines
the nonlinear character of the bifurcation. If $\alpha > 0$ the bifurcation is
supercritical and nonlinearity saturates the growth of A, resulting in a continuous
transition. If $\alpha < 0$ the instability is subcritical and a sufficiently strong
perturbation can trigger a nonlinear instability even for $R < R_c$; the transition
is discontinuous and hysteretic.
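As a short worked illustration of the two cases, the steady states of the normal form (6.25) and their stability can be written down directly. Here $\sigma$ denotes the positive linear coefficient and $\alpha$ the Landau coefficient in (6.25); this is the standard calculation for a cubic normal form, not a result specific to the flows discussed in this chapter.

```latex
% Steady states of (6.25): set \partial_t A = 0
A = 0
\qquad\text{or}\qquad
A_s^2 = \frac{\sigma (R - R_c)}{\alpha}.

% Linearizing (6.25) about A_s gives the growth rate of a small disturbance:
\sigma (R - R_c) - 3 \alpha A_s^2 = -2\, \sigma (R - R_c).

% \alpha > 0: A_s exists only for R > R_c, is stable, and grows
%             continuously like (R - R_c)^{1/2}   (supercritical).
% \alpha < 0: A_s exists only for R < R_c and is unstable; it acts as a
%             threshold amplitude for finite disturbances (subcritical).
```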

The critical task is to determine the sign of $\alpha$. First we compute the steady
flow U and the leading eigenmode u for R slightly above $R_c$. We then start
a nonlinear simulation using the initial condition $U + \varepsilon u$ for some small $\varepsilon$.
Choosing some parameter to represent the amplitude A of the bifurcation, we
follow the growth of A in time. Initially the simulation shows growth at the
linear rate, consistent with a small positive eigenvalue $\sigma(R - R_c) > 0$. As the
flow becomes more nonlinear the time series will begin to deviate from the
growth predicted by linear theory, at which point it is simple to estimate the
value of $\alpha$ directly from the time series. For a supercritical bifurcation the
amplitude begins to grow slower than the linear rate, while for a subcritical
bifurcation it begins to grow faster. Therefore, the sign of $\alpha$ can be determined
quite reliably.
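One way to extract the coefficient from a time series can be sketched as follows. Dividing (6.25) by A gives d(ln A)/dt = σ(R − R_c) − α A², so a straight-line fit of the instantaneous growth rate against A² yields the linear growth rate as intercept and the Landau coefficient as minus the slope. The sketch below is an assumption, not the procedure used in the cited studies: the time series is synthetic (generated from the normal form itself in place of a nonlinear simulation), and the names `simulate`, `estimate_landau`, `growth` and the parameter values are illustrative.

```python
import numpy as np

def landau_rhs(A, growth, alpha):
    """Right-hand side of the cubic normal form; growth stands for sigma*(R - R_c)."""
    return growth * A - alpha * A**3

def simulate(A0, growth, alpha, dt=0.01, n_steps=4000):
    """Synthetic amplitude time series from RK4 integration of the normal form."""
    A = np.empty(n_steps + 1)
    A[0] = A0
    for n in range(n_steps):
        a = A[n]
        k1 = landau_rhs(a, growth, alpha)
        k2 = landau_rhs(a + 0.5 * dt * k1, growth, alpha)
        k3 = landau_rhs(a + 0.5 * dt * k2, growth, alpha)
        k4 = landau_rhs(a + dt * k3, growth, alpha)
        A[n + 1] = a + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return dt * np.arange(n_steps + 1), A

def estimate_landau(t, A):
    """Fit d(ln A)/dt = growth - alpha * A^2 by linear least squares."""
    rate = np.gradient(np.log(A), t)              # instantaneous growth rate
    slope, intercept = np.polyfit(A**2, rate, 1)
    return intercept, -slope                      # (linear growth rate, Landau coefficient)

t, A = simulate(A0=1e-3, growth=0.2, alpha=2.0)   # supercritical toy values
growth_est, alpha_est = estimate_landau(t, A)
```

With data from a real simulation the fit would be restricted to the weakly nonlinear part of the record, and a negative fitted coefficient would indicate a subcritical bifurcation.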
6.3  Examples

We  close  with  two  detailed  examples  showing  the  application  of  spectral 
element  methods  to  complex  transition  problems:  flow  over  a  backward-facing 
step  [8,46,47],  and  flow  over  a  circular  cylinder  [9,41,42].  Numerous  other 
applications can be found in the literature, including: perturbed plane Couette
flow  [10],  perturbed  channel  flow  [71],  turbulent  flow  past  a  sphere  [80],  and 
turbulent  flow  over  riblets  [22,23]. 


Backward-facing  step  The  separated  flow  generated  as  fluid  passes  over  a 
backward-facing  step  is  of  interest  for  a  variety  of  reasons.  Firstly,  separated 
flows  produced  by  an  abrupt  change  in  geometry  are  of  great  importance  in 
many  engineering  applications.  This  has  driven  numerous  studies  of  the  flow 
over  a  backward-facing  step  during  the  past  30  years,  e.g.  [3,28].  Secondly, 
from  a  fundamental  perspective,  there  is  a  strong  interest  in  understanding 
instability  and  transition  to  turbulence  in  non-parallel  open  flows.  In  this 
context  the  flow  over  a  backward-facing  step  has  emerged  as  a  prototype  of  a 
nontrivial yet simple geometry in which to examine the onset of turbulence [5,
45-47, 50]. Finally, from a strictly computational perspective, the steady
two-dimensional flow over a backward-facing step is an established benchmark in
computational fluid dynamics. New computational studies such as the highly
accurate stability computations considered here help expand the database for this
benchmark  problem. 

The  two-dimensional  linear  stability  of  this  flow  has  been  examined  ex¬ 
tensively  and  is  discussed  in  several  publications  [30,31,36].  However,  addi¬ 
tional  computational  evidence  supports  the  existence  of  a  local  convective 
instability  (again  to  two-dimensional  disturbances)  for  a  sizable  portion  of 
the  domain  at  Re  >  525  [47].  In  spite  of  the  numerous  investigations  of  flow 
over  a  backward-facing  step  available  in  the  literature,  two  of  the  most  basic 
questions  for  this  flow  remain  open:  in  the  ideal  problem  with  no  sidewalls,  at 
what Reynolds number does the two-dimensional laminar flow first become
linearly  unstable,  and  what  is  the  nature  of  this  instability?  These  are  the 
questions  we  wish  to  address. 


Fig. 6.15. Flow geometry for the backward-facing step, with inlet height ah, step
height h, and outlet height (1 + a)h. The origin of the coordinate system is at the
step edge. We take the ratio of inlet height to step height as a = 1, so that the
expansion ratio is 1 + a = 2.

Figure 6.15 illustrates the computational domain under consideration and
also serves to define the geometric parameters for the problem. We consider a
step of height h and take the edge of the step as the origin of our coordinate
system. Fluid arrives from an inlet channel of height ah and flows downstream
into an outlet channel of height (1 + a)h. Here we fix a = 1, giving an
expansion ratio (outlet to inlet) of 1 + a = 2. The inflow and outflow lengths
$L_i$ and $L_o$ should be large enough that the results are independent of these
parameters. At the inlet, $L_i = h$ is sufficient for the range of Reynolds numbers
we consider [46, 84]. The required outflow length $L_o$ varies with Reynolds
number and must be determined from a proper convergence study. Acceptable
values for the range of Re considered here are $15h < L_o < 55h$ [8].
Finally we take the system to be infinitely large and homogeneous in the
spanwise direction, i.e. $L_z = \infty$.


Fig. 6.16. Computational domains for simulating flow over a backward-facing step.
Two subsections of mesh $M_1$ are expanded to show the internal distribution of
quadrature points for polynomial order p = 7. To simulate a three-dimensional
flow the solution is decomposed into M Fourier modes in the periodic spanwise
direction, each computed on the same two-dimensional grid [8].


Figure  6.16  shows  a  collection  of  nonconforming  spectral  element  grids 
for  simulating  this  flow  [8].  Each  grid  uses  local  refinement  to  isolate  the 
singularity  induced  by  the  sharp  corner,  and  to  resolve  the  important  recir¬ 
culation  zones  in  the  wake  of  the  step  and  along  the  upper  wall.  The  use  of 
local mesh refinement allows high resolution of the critical regions in this flow
along with a large computational domain that pushes the outflow boundary
far downstream.

Fig. 6.17. Three-dimensional structure of the leading eigenmode. Contours indicate
the strength of the downstream component of the perturbation and vectors indicate
the spanwise flow pattern at each downstream plane [8].

Stability  calculations  for  this  flow  consist  of  two  parts.  First,  the  steady 
state  solution  for  a  given  Re  is  computed  using  either  time-integration  or  New¬ 
ton  methods  [81].  Second,  the  relevant  bifurcation  points  along  the  steady 
branch  of  solutions  are  computed  using  two-  and  three-dimensional  linear 
stability  analysis  based  on  the  iterative  methods  outlined  in  Sect.  6.1.  The 
additional parameter for three-dimensional stability calculations is the spanwise
wavenumber $\beta$ of the perturbation. We define $\beta = 2\pi/\lambda$ where $\lambda$ is the
corresponding wavelength.

First  consider  the  three-dimensional  stability  of  the  flow.  Figure  6.18 
shows  the  neutral  stability  curve  up  to  Re  =  1000.  Everywhere  to  the  right 
of  the  curve  the  flow  has  a  positive  eigenvalue  and  is  linearly  unstable. 
The  points  were  obtained  by  accurately  finding  zero  crossings  of  eigenvalue 
branches (as a function of $\beta$) for several Reynolds numbers between 750 and
1000. From Fig. 6.18 it can be seen that the primary linear instability for
the backward-facing step occurs very near Re = 750. The instability is
three-dimensional with a spanwise wavenumber $\beta \approx 0.9$.
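Locating a zero crossing of an eigenvalue branch amounts to a one-dimensional root search in the wavenumber at fixed Reynolds number. The sketch below illustrates this with bisection. Everything specific in it is hypothetical: `growth_rate` is a made-up quadratic model standing in for a full stability calculation (its numbers are unrelated to the backward-facing step), and in a real computation each call would be one Arnoldi solve.

```python
def bisect(f, lo, hi, tol=1e-10):
    """Find a sign change of f inside [lo, hi] by bisection."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("bracket does not contain a sign change")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def growth_rate(beta, re):
    """Hypothetical leading-eigenvalue model; a real code would run a stability solve here."""
    return 1e-3 * (re - 100.0 - 50.0 * (beta - 1.0) ** 2)

def neutral_betas(re, beta_peak=1.0, beta_min=0.01, beta_max=3.0):
    """Two neutral wavenumbers bracketing the unstable band at a given Re."""
    f = lambda b: growth_rate(b, re)
    return bisect(f, beta_min, beta_peak), bisect(f, beta_peak, beta_max)

beta_lo, beta_hi = neutral_betas(112.5)
```

Repeating the search over several Reynolds numbers traces out the two sides of a neutral curve.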

The three-dimensional structure of the leading eigenmode is shown in Fig. 6.17.
The flow visualization is constructed by forming the linear superposition
$U + \varepsilon u$ of the steady base flow and the computed perturbation field. The
structure of the 3D instability represents streamwise vortices that originate
in the recirculation zone just downstream of the step. In principle, the flow
shown in Fig. 6.17 could be integrated forward in time using the full
Navier-Stokes equations to determine the nonlinear stability of this flow. This is
complicated by the presence of a strong convective instability [47] and has
not been computed satisfactorily to date.

It  is  also  interesting  to  look  at  the  two-dimensional  stability  of  this  flow. 
Note that in the limit $\beta \to 0$ the eigenmodes fall into one of two categories,
either

$\mathbf{u}(x,y) = (u(x,y), v(x,y), 0),$  (6.26)

or

$\mathbf{u}(x,y) = (0, 0, w(x,y)).$  (6.27)

We  shall  refer  to  these  as  type-I  and  type-II  modes  respectively. 




Fig.  6.18.  Neutral  stability  curve  for  backward-facing  step  flow.  In  the  shaded 
region  the  flow  is  linearly  unstable  at  the  corresponding  Reynolds  numbers  and 
spanwise  wavelengths  [8]. 


Figure  6.19  shows  the  first  three  eigenvalues  corresponding  to  two-dimensional 
modes.  At  low  Reynolds  number  these  modes  are,  in  order  of  decreasing  real 
part, type-I, type-II, and then again type-I. At $Re \approx 1000$ the second type-I
eigenvalue  crosses  the  type-II  eigenvalue,  but  they  do  not  merge  because  the 
eigenmodes  are  of  different  type.  A  semi-log  plot  of  the  data  indicates  that  the 
eigenvalues  depend  exponentially  on  Re  over  this  range.  The  corresponding 
exponential  fits  are  shown  in  Fig.  6.19.  Extrapolation  of  these  fits  indicates 
that the two real eigenvalues would cross at $Re \approx 1350$. It is thus likely that
the  two  eigenvalues  join  in  a  complex  pair  near  Re  =  1350.  This  is  consistent 
with  two-dimensional  simulations  at  Re  =  1350  which  show  oscillatory  decay 
to the two-dimensional steady state. Because the exponential fits in Fig. 6.19
will not be valid as the eigenvalues approach one another, it is impossible
to  estimate  what  happens  at  higher  Reynolds  numbers  based  on  the  cur¬ 
rent  data.  One  possibility  is  that  the  two-dimensional  linear  instability  for 
this  flow  is  a  Hopf  bifurcation  arising  from  the  joining  of  the  two  eigenvalue 
branches  [8]. 



Fig.  6.19.  Two-dimensional  stability  results  for  the  backward-facing  step.  Solid 
points  and  hollow  squares  denote  eigenvalues  corresponding  to  type-I  modes  and 
type-II  modes,  respectively  [8]. 


Cylinder wake For our final example we return to the problem of flow past a
circular  cylinder.  The  range  of  Re  from  about  10  to  1000  shown  in  Figs.  5.12 
and  5.13  represents  the  entire  sequence  of  states  from  steady  laminar  flow  to 
complex  turbulent  flow  for  this  system.  What  we  wish  to  understand  are  the 
secondary instabilities corresponding to $Re_2$ and $Re_2'$ and how these instabilities
drive the transition to turbulence. Roshko first identified the transition
range  for  flow  past  a  circular  cylinder  as  the  range  of  Re  where  velocity 
fluctuations  become  irregular  [67];  this  is  generally  quoted  at  Re  =  150  to 
300.  Early  flow  visualization  studies  revealed  some  three-dimensionality  in 
this  regime  [32,37],  but  it  was  really  Williamson  who  captured  the  intricate 
structure  of  the  3D  flow  and  demonstrated  the  clear  presence  of  a  finite- 
wavenumber secondary instability [86]. The basic flow patterns consist of two
types  of  3D  vortex  shedding  now  referred  to  as  mode  A  and  mode  B.  For  rea¬ 
sons  discussed  below  these  structures  are  fleeting  and  can  only  be  captured 
in  pure  form  on  the  computer.  From  this  point  we  proceed  in  stages,  first 
looking  at  the  linear  and  nonlinear  instabilities  that  produce  these  modes, 
then  mechanisms  by  which  they  interact  to  cause  transition,  and  finally  some 
properties of the ‘turbulent’ flow at higher Re.

Linear stability theory is the natural context for examining the origin of
three-dimensionality in the wake [9,62]. The linear stability problem determines
the structure and spatiotemporal symmetry of the global modes and
the critical parameter values ($Re_2$ and $Re_2'$) where they first become unstable.
Once perturbed these modes are self-excited and cause transition to a
three-dimensional state. The symmetry of the wake after transition is determined
by the spatiotemporal symmetry of the destabilizing global mode.

Computational  domains  appropriate  for  simulating  the  flow  past  a  cylin¬ 
der  were  shown  in  Fig.  5.14.  Like  the  previous  example,  stability  calculations 
for this flow consist of two parts. First we compute the 2D base flow corresponding
to the Karman vortex street by integrating the fluid equations
until  they  converge  to  a  time-periodic  state.  Second,  we  compute  the  rele¬ 
vant  bifurcation  points  along  the  2D  time-periodic  branch  of  solutions  using 
three-dimensional  Floquet  stability  analysis. 

Figure  6.20  shows  the  neutral-stability  curves  for  the  wake  and  the  two 
regions  of  instability  that  produce  modes  A  and  B.  These  calculations  are 
performed using the stability methods outlined in Sect. 6.1; a detailed explanation
is given in [9]. The critical values are $Re_2 \approx 190$ and $\lambda_2 \approx 3.96d$ for
mode A, and $Re_2' \approx 260$ and $\lambda_2' \approx 0.822d$ for mode B. Note that mode A has a
relatively long wavelength that scales on the primary instability wavelength,
i.e. the Karman vortex spacing of $\lambda \approx 5d$, while mode B has a relatively short
wavelength that presumably scales on the thickness of the separating shear
layer. Experimental measurements show exceptional agreement with the predicted
maximum growth rate curve for mode A [88]. Measurements for mode
B also cluster nicely into the predicted range of unstable wavelengths. Referring
back to Figs. 5.12 and 5.13 shows that the critical points for the linear
instabilities coincide with the observed transition points in the response of St
and $C_d$. Given the complexity of the system this is outstanding agreement
for  a  non-trivial  set  of  quantities.  It  is  also  a  triumph  for  linear  stability  cal¬ 
culations  that  reduce  the  complexity  of  the  full  three-dimensional  stability 
problem  to  a  level  that  can  be  run  on  a  workstation. 

Next  we  apply  the  methods  for  nonlinear  stability  analysis  described  in 
Sect.  6.2  to  determine  the  nonlinear  stability  of  mode  A  and  mode  B.  As 
stated,  the  Landau  coefficient  in  (6.25)  can  be  evaluated  from  a  single  time 
series  computed  from  a  full  nonlinear  calculation.  A  convenient  measure  of 
the  amplitude  A  is  the  magnitude  of  the  Fourier  component  corresponding  to 
the 3D perturbation [42]. This analysis indicates that the Landau coefficient is
$\alpha = -0.116$ for mode A (subcritical) and $\alpha = 3.92$ for mode B (supercritical).
Once these coefficients are known the steady-state amplitudes |A| and |B| can be
computed explicitly.^4 Figure 6.21 shows this in the form of a bifurcation diagram
for the two instabilities. This figure also includes additional DNS results that
verify the validity of the amplitude model near the critical points [41, 42].
Although these results have not yet been confirmed directly by experiment they
are consistent with experimental observations. Referring back to Fig. 5.12 we
see there is good agreement in the range of hysteresis and the computed
frequency drop. The discontinuous drop in shedding frequency is a natural
result of the subcritical bifurcation to mode A.

^4 Because mode A is subcritical, the coefficient of the next-order term $A^5$ is
necessary to determine saturation. This can be estimated using the same technique
applied to determine the Landau constant [42].


Fig.  6.20.  Regions  of  linear  instability  for  the  cylinder  wake,  neutral  curves  and 
critical  points  (•)  are  from  computations  [9].  Open  symbols  indicate  wavelength 
measurements  from  various  experimental  studies  [58, 88, 90]. 


Figure  6.22  shows  a  visualization  of  the  full  nonlinear  form  modes  A  and 
B  exhibit  at  saturation  in  terms  of  their  streamwise  and  spanwise  compo¬ 
nents  of  vorticity.  This  figure  also  reveals  their  distinct  space-time  symme¬ 
tries.  These  symmetries  are  manifest  in  the  form  of  a  staggered  array  of 
streamwise  vortices  for  mode  A  and  an  inline  array  of  streamwise  vortices  for 
mode  B  [9, 18, 89].  Several  simulations  of  the  three-dimensional  flow  (all  using 
spectral  element  methods)  have  reproduced  the  essential  features  observed  in 
experiment  and  there  is  now  little  doubt  regarding  the  qualitative  structure 
of  modes  A  and  B  [41, 79]. 

Unfortunately these states are not observed in pure form in the laboratory.
In the range $Re \approx 200$ to 260 the natural flow structure may be more
appropriately characterized as a mixed A-B state like the one shown in Fig. 6.22c.
The relevant facts are the following. Velocity fluctuations exhibit broad-band
frequency spectra just beyond the onset of mode A, and mode A is in fact
only observed as a transient in the approximate range $Re \approx 180$ to 200. At
long times the flow is highly irregular. In contrast to this, mode B is observed
with good regularity from $Re \approx 200$ on, and as $Re \to Re_2'$ there is a
reasonably well-defined wavelength in the near wake and a sharp peak in the
frequency spectrum. However, this peak is superimposed over a broad band of
frequencies in the background indicative of ‘turbulence’ farther downstream.
From these observations we see that the flow undergoes a fast transition to
a state that may be characterized as spatiotemporal (ST) chaos at the onset
of mode A rather than through a sequence of further bifurcations.

Fig. 6.21. Bifurcation diagrams for (upper) mode A and (lower) mode B. Points
(•) indicate results from three-dimensional simulations [41,42].

What  are  the  properties  of  the  system  that  would  lead  one  to  expect 
chaotic behavior? We shall argue this in terms of the spanwise energy spectrum
shown in Fig. 6.23, spanwise dimension L, excitation scale $l_e$, and
dissipation scale $l_d$. ST chaos is a common feature of systems where excitation
occurs at a length scale much smaller than the system size but larger
than the dissipation scale ($L \gg l_e > l_d$). The excitation scale $l_e \approx \lambda_2$ is
fixed by the finite-wavenumber instability of mode A. The subcritical nature
of the bifurcation indicates that $l_e > l_d$ at onset. Simulations indicate that
the dynamics are time-periodic or quasi-periodic when $L \approx \lambda_2$ so that only
one or two mode A instabilities can be excited. When $L \gg \lambda_2$ many A-modes
are  excited  and  the  simulated  flow  exhibits  ST  chaos  that  is  in  qualitative 
agreement  with  experimental  observations.  The  dynamics  in  this  case  are 
driven  by  the  nonlinear  competition  between  multiple  mode  A  instabilities. 
This  scenario  is  exactly  the  Ruelle-Takens-Newhouse  (RTN)  route  to  tur¬ 
bulence,  a  universal  route  to  turbulence  in  dissipative  systems  that  develop 
three  or  more  incommensurate  modes  of  oscillation  [61,68].  Finally  we  close 
this  example  with  some  observations  of  the  ‘turbulent’  flow  that  develops 
beyond the transition regime. If one accepts the definition of a turbulent flow
as being characterized by continuous spatial and temporal spectra, then the
cylinder wake is fully turbulent at Re = 300. In the classical view further
increasing Re pushes the system into the regime of ‘featureless’ turbulence.

Fig. 6.22. Flow visualization of the three-dimensional vorticity field due to
secondary instability in the wake of a circular cylinder: (a) mode A at Re = 195, (b)
mode B at Re = 265, and (c) mixed A-B state at Re = 265 [41].

Fig. 6.23. Computed spanwise energy spectrum of the cylinder wake at Re = 265,
indicating the excitation scale due to mode A and the dissipation scale due to
viscosity [41].

Fig. 6.24. Formation of dislocations in the turbulent cylinder wake: (left)
experimental smoke-wire visualization at Re = 5500 (Norberg 1992); (right) DNS results
at Re = 1000 (Henderson 1997).

There is at least one additional interesting phenomenon that occurs beyond
the transition regime that can be identified as a unique feature of the
flow. Figure 6.24 shows a spanwise view of the wake that reveals a set of
dislocations in the pattern of vortex shedding. This figure compares both
experimental flow visualization and computer simulations with a large spanwise
dimension of $L \approx 25.13d$ [41,63]. Other experiments of turbulent flow past
a cylinder also show evidence of dislocations at Re as high as $10^5$ [15]. At
high  Re  these  structures  develop  spontaneously  as  long  as  the  aspect  ratio  is 
sufficiently  large. 

Dislocation events have a distinct effect on the fluctuating lift and drag.
Figure 6.25 shows computed values of $C_d$ and $C_l$ as a function of the spanwise
dimension L at Re = 1000. In small systems the formation of dislocations
is suppressed and the unsteady forces are roughly periodic. In large systems
$C_l$ in particular appears in ‘bursts.’ Minimum values of $C_l$ occur during
the formation of a dislocation due to phase differences along the span of the
cylinder. This ‘bursting’ phenomenon is a generic feature of high-Re flow past
bluff bodies and is also reported in experimental studies of flow past cylinders
and bluff plates [56,78].




Fig.  6.25.  Unsteady  lift  and  drag  coefficients  for  the  ‘turbulent’  flow  past  a  cylinder 
at  Re  =  1000,  illustrating  the  effect  of  increasing  domain  size. 


A natural extension of the computational results reported here is to pursue
large-eddy simulation (LES) of the turbulent flow at higher Re. A better
understanding of the role that large-scale structures play in the overall mixing
and dynamics of the flow is certainly necessary for this to succeed. Further
experience with the application of high-order methods to LES is also needed.
In particular, there are challenges related to issues like proper filtering and
formal correctness on locally refined grids.

Abstract. We present some mathematical foundations of hp-FEM for fluid flow
simulation. Particular attention is paid to the mesh design for viscous,
incompressible flow where the regularity of the solution mandates resolution of
corner singularities and boundary layers. Stabilized and discontinuous hp-FEM
for advection dominated and nearly incompressible flows are derived. A new
hp-adaptive time stepping strategy for spectral accuracy in transient problems
is presented.

Table of Contents

1 Introduction
    1.1 General Remarks
    1.2 Notation
    1.3 Governing Equations
2 Model Problems
    2.1 Reaction-Diffusion
    2.2 Convection
    2.3 Convection-Diffusion
3 Solution properties
    3.1 Corner Singularities
    3.2 Boundary layers
    3.3 Viscous Shock Profiles
4 Basic hp FEM
    4.1 hp-FE Spaces on patchwise structured grid
    4.2 Choice of Patch Meshes T_P in 2-d
    4.3 hp-spaces on T
5 hp-Error Estimates
    5.1 Basic error estimates
    5.2 Corner singularities
    5.3 hp-Boundary layer resolution
6 Reaction-Diffusion
    6.1 Standard (continuous) Discretization
    6.2 Mixed discretization
    6.3 Mortar-Discretization
    6.4 Discontinuous Galerkin Method for second order problems
7 Convection
    7.1 Model convection problem


326  Christoph  Schwab 


    7.2 The hp-SDFEM
    7.3 Discontinuous-Galerkin hp-FEM
    7.4 hp-Error Analysis of the DG- and the SDFEM
8 Convection-Diffusion
    8.1 Standard Galerkin discretization
    8.2 Streamline-Diffusion FEM
9 Elasticity
    9.1 Basic equations
    9.2 Variational formulation
    9.3 Standard continuous discretization
    9.4 Dual mixed formulation
    9.5 Mortar Discretization
    9.6 Discontinuous Galerkin discretization
10 Incompressibility
    10.1 Basic Equations
    10.2 Variational formulation of the Stokes problem
    10.3 FE-discretization of the Stokes problem
    10.4 GLS stabilized hp-FEM for the Stokes problem
    10.5 Numerical experiments
    10.6 Almost incompressibility
    10.7 Advection dominated compressible (elastic) flow
11 hp-time-stepping
    11.1 hp-FEM for first order transient, hyperbolic problems
    11.2 The DG(r)-FEM for nonlinear initial value problems
    11.3 DG(r)-FEM for abstract initial boundary value problems
    11.4 An example: Heat-equation


1  Introduction 


1.1  General  Remarks 

These lecture notes are intended as an introduction to the subject of
hp-Finite Element Methods with particular attention to computational fluid
dynamics (CFD) problems. We assume that the reader is familiar with the
governing equations of viscous flow, both compressible and incompressible, as
well as with the basic facts on hyperbolic systems of conservation laws. A good
reference on the analysis of the incompressible Navier-Stokes equations is
e.g. [71], and for hyperbolic conservation laws with particular attention to
numerical methods we mention [32].

What are hp-FEM? There are at present two dominant methodologies in
CFD algorithm design: spectral methods and Finite Difference (FD)/Finite Volume
(FV) methods. Spectral discretizations in fluid dynamics have a long history,
see e.g. [17] and the references there. Spectral methods are typically based
on subdivisions of the domain into few, rather large elements with high order
polynomial discretizations of the field variables. In most cases, the partial
differential equations are discretized using collocation in special sets of
nodes, mostly the Chebyshev or the Lobatto nodes. In spectral methods,
convergence is achieved by raising the order k of the approximation rather than
by reducing the meshwidth h, as is done in finite difference or classical finite
element methods. In FD/FV methods, on the contrary, convergence is achieved by
refining the mesh, possibly adaptively, and by (adaptively) reducing the order
of the scheme near discontinuities using limiters. The resulting numerical
schemes are nonlinear, even when applied to linear problems. Unlike FD/FV
methods, the convergence order of spectral methods is limited only by the
regularity of the solution (loosely speaking, by the growth of high order
derivatives of the solution, provided they exist). There are, however, instances
(and we will discuss them) when high derivatives of the solution fail to exist,
at least in subdomains, and in these cases nothing is to be gained by using very
high order approximations everywhere¹.
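
The contrast between raising the order k and reducing the meshwidth h can be
illustrated with a small experiment. The following is a minimal sketch, not
taken from these notes: it interpolates a smooth test function (a hypothetical
choice) on [-1, 1] once by a single polynomial of degree k at the
Chebyshev-Lobatto nodes and once by piecewise linears on n uniform elements,
and records the maximum error of each. The function names are illustrative
assumptions.

```python
import math

def cheb_lobatto_nodes(k):
    """Chebyshev-Lobatto nodes x_j = cos(pi*j/k), j = 0..k, on [-1, 1]."""
    return [math.cos(math.pi * j / k) for j in range(k + 1)]

def cheb_interp(f, k):
    """Degree-k interpolant of f at the Chebyshev-Lobatto nodes (barycentric form)."""
    xs = cheb_lobatto_nodes(k)
    fs = [f(x) for x in xs]
    # Barycentric weights for these nodes: (-1)^j, halved at the two endpoints.
    ws = [(-1.0) ** j * (0.5 if j in (0, k) else 1.0) for j in range(k + 1)]

    def p(x):
        num = den = 0.0
        for xj, fj, wj in zip(xs, fs, ws):
            if x == xj:          # evaluation point coincides with a node
                return fj
            c = wj / (x - xj)
            num += c * fj
            den += c
        return num / den
    return p

def linear_interp(f, n):
    """Piecewise linear interpolant of f on n uniform elements of [-1, 1]."""
    h = 2.0 / n

    def p(x):
        i = min(int((x + 1.0) / h), n - 1)   # index of the element containing x
        x0 = -1.0 + i * h
        t = (x - x0) / h
        return (1.0 - t) * f(x0) + t * f(x0 + h)
    return p

def max_error(f, p, m=2001):
    """Maximum of |f - p| over m equispaced sample points in [-1, 1]."""
    pts = [-1.0 + 2.0 * i / (m - 1) for i in range(m)]
    return max(abs(f(x) - p(x)) for x in pts)

f = lambda x: math.exp(x) * math.sin(3.0 * x)   # hypothetical smooth test function
p_errors = {k: max_error(f, cheb_interp(f, k)) for k in (4, 8, 16)}
h_errors = {n: max_error(f, linear_interp(f, n)) for n in (4, 8, 16)}
```

For this analytic function the entries of p_errors fall off much faster than
those of h_errors as k and n are doubled; for a function with a kink the
advantage of the high order interpolant would be lost, in line with the remark
above.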

hp-FEM can be viewed as a unification of both ideas: in a sense, they
allow the combination of (necessarily anisotropic) local mesh refinement in
areas where the exact solution lacks regularity with large, spectral type
elements in areas where the solution is smooth.

When to use hp-FEM? hp-FEM have been successful in applications to
structural mechanics, in particular in applications where high accuracy is
required and where the solutions lack regularity locally due to corner
singularities and/or the presence of small parameters (singular perturbation
problems), see e.g. [46], [59]. As we shall see, also in computational fluid
dynamics the judicious application of properly designed hp-FEM can in many
practical situations deliver high resolution and exponential convergence rates
where either FDM/FVM or spectral methods would only yield algebraic rates.

Let us briefly outline common features and differences between hp-FEM
and spectral and FD/FV methods. hp-FEM share with spectral element and
FV methods that arbitrary geometries can be discretized via parametric
element maps. Unlike spectral methods, hp-FEM also allow for a nonuniform
distribution of the polynomial degree, i.e. of the order of accuracy: for
example, not only can the mesh be locally refined near shocks but the order of
the method may also be reduced to first order there. This order reduction
corresponds to the use of limiters. However, the resulting algorithm is linear
for linear problems. This reduction to first order in hp-FEM does not entail a
loss of overall exponential accuracy if the elements where first order is used
are exponentially small. This is typically the case provided we employ geometric
meshes with a number of refinement levels coupled to the spectral order of
the elements. In the small elements supporting the first order discretization,
all techniques from FDM/FVM for dealing with discontinuous solutions can
be brought to bear.

¹ In these cases, the mathematical theory of n-widths indicates that uniform mesh
refinement with a low order method will give optimal convergence rates that can
at best be matched but not surpassed by spectral methods.
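
A geometric mesh whose number of refinement levels is tied to the spectral
order can be sketched in one dimension as follows. This is an illustration
under stated assumptions, not the construction used later in these notes: the
grading factor sigma = 0.5, the choice of as many levels as the degree k, and
the function names are all hypothetical. The element touching the point of
low regularity (here x = 0) carries degree 1 and has size sigma^levels, which
is exponentially small in the number of levels.

```python
def geometric_mesh(levels, sigma=0.5):
    """Nodes 0 < sigma^levels < ... < sigma < 1 of a mesh on (0, 1) graded toward x = 0."""
    if not 0.0 < sigma < 1.0:
        raise ValueError("grading factor sigma must lie in (0, 1)")
    return [0.0] + [sigma ** (levels - i) for i in range(levels + 1)]

def degree_vector(levels, k):
    """Degree 1 on the smallest element next to x = 0, degree k on all other elements."""
    return [1] + [k] * levels

# Couple the number of refinement levels to the spectral order (illustrative choice).
levels = k = 6
nodes = geometric_mesh(levels)
degrees = degree_vector(levels, k)
smallest = nodes[1] - nodes[0]      # size of the first order element: sigma**levels
```

Each additional level halves the size of the first order element, so its
contribution to the error decays geometrically while the total number of
elements grows only linearly.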

As a rule, hp-FEM are based on certain variational formulations of the
problem under consideration. Discretization is performed by restricting in
these formulations the unknown physical fields to finite dimensional
subspaces. The design of these subspaces shall be discussed in detail below. In
the derivation of hp-methods, one assumes first that integrals in the
variational formulation are evaluated exactly. This is rarely possible in
practice, since for example for curved elements and nonlinearities some form of
numerical quadrature is an integral part of hp-FE algorithms. The resulting,
fully discrete methods are in essence hp-spectral element methods sharing
features of hp-FEM (variational formulation, variable polynomial degree/spectral
order) and of the traditional spectral methods (collocation of nonlinearities).

The pillars of any convergent numerical algorithm are stability and
consistency. Exponential convergence rates with hp-FEM require, as a rule, the
proper design of the hp-subspaces, i.e. proper choice of the mesh and the
degree distribution. Many choices are usually possible. They can be based
either upon the dominant solution phenomena or on adaptive strategies. As
a rule, hp-FEM are most efficient when highly anisotropic elements are
admitted, e.g. in boundary layers or viscous shock profiles; since anisotropic
adaptive refinements are to date still not as well developed as isotropic ones,
some a-priori mesh design with anisotropic elements should be performed in
the appropriate flow regions whenever possible. The use of body-fitted,
structured meshes is well-established in CFD and should be kept with hp-FEM
whenever possible. Note, however, that hp-meshes may differ considerably
from the ones used with low order methods.

Unlike in solid mechanics, variational formulations of fluid flow problems
are usually neither symmetric nor coercive due to dominant transport effects.
Therefore stabilized variational formulations have to be used to achieve
stability of FEM in the presence of advection. We will discuss in detail the
most frequently employed formulations such as Galerkin Least Squares (GLS) and
the streamline diffusion FEM (SDFEM) as well as certain Discontinuous-Galerkin
(DG) methods. Such formulations are well established in CFD, but
have to be adapted to accommodate hp-FEM.

These lectures aim at the description of hp-FEM with particular attention
to the formulation of hp-schemes for flow problems and their error analysis.
Methodologically, we start by describing hp-FE discretizations of simple
linear diffusion and transport processes, followed by Galerkin schemes for
inviscid conservation laws and finally the full, compressible NSE. In each case,
we explain carefully the design of meshes and order distributions which are
most efficient for the resolution of specific flow phenomena, such as
singularities, boundary layers and viscous shock profiles, in the context of
judiciously chosen model problems. Likewise, the GLS, SDFEM and DG stabilization
techniques for convection dominated problems will also be discussed first for
such model problems.

These notes are not intended as a mathematical treatise on hp-FE theory;
rather, they try to give a concise overview of variational formulations for
hp-FEM and of theoretical convergence results that are essential for efficient
fluid flow simulation. The material presented is biased towards the recent
work of the author. Nevertheless, we have tried to give up to date references
to related, and particularly computational, work. These references, as well
as the other articles in the present volume, should be consulted for different
viewpoints on high order methods.

1.2  Notation 

We list some notation which will be used throughout the text. We will denote
the physical domain in which the computations will be performed by
$\Omega \subset \mathbb{R}^d$ where the dimension $d = 1, 2, 3$ ($d = 1$ will
rarely be considered). Partial derivatives with respect to the spatial variables
$x_i$ will be denoted by $\partial_i$ and will be understood in the
distributional sense, unless stated otherwise. The usual differential operators
$\nabla$, $\Delta$, div etc. shall be used and summation over repeated indices
is employed. By $L^2(\Omega)$ we denote the usual space of square integrable
functions in $\Omega$. By $H^k(\Omega)$, $k \ge 0$, we denote the Sobolev space
of functions whose derivatives up to order $k$ are square integrable in
$\Omega$. Evidently, $H^0 = L^2$. By $(\cdot,\cdot)_\Omega$ we denote the $L^2$
inner product over the set $\Omega$, i.e.
$(u, v)_\Omega = \int_\Omega u\,v\,dx$. $L^2(\Omega)$ is a Hilbert space with
inner product $(u, v)_\Omega$ and norm
$\|u\|_{0,\Omega} := ((u, u)_\Omega)^{1/2}$. Analogously, $H^k(\Omega)$ is
equipped with the inner product

  $(u, v)_{k,\Omega} = \sum_{|\alpha| \le k} (D^\alpha u, D^\alpha v)_\Omega$

where $\alpha \in \mathbb{N}_0^d$ is a multiindex and $D^\alpha$ is the
derivative of order $\alpha$. The norm $\|u\|_{k,\Omega}$ is defined
analogously to the $L^2$ case:

  $\|u\|_{k,\Omega} = ((u, u)_{k,\Omega})^{1/2}$

Similarly, we define Sobolev spaces and norms on lower dimensional sets, such
as e.g. the boundary $\Gamma = \partial\Omega$. The $L^2(\Gamma)$ inner product
is just the (Lebesgue) surface integral taken with respect to the surface
measure on $\Gamma$ and we write
$\langle u, v\rangle_\Gamma = \int_\Gamma u\,v\,ds$. Spaces of vector valued
functions will be denoted by a superscript after the space, i.e.

  $H^k(\Omega)^m = [H^k(\Omega)]^m$

denotes the m-fold tensor product of the space $H^k(\Omega)$. Typically, m will
denote the number of state variables in the system under consideration.

Throughout, the spectral order of the elements will be denoted by the
letter $k$, elements by $K$ and partitions of $\Omega$ into d-dimensional
elements by $\mathcal{T}$. The letter $\mathcal{E}$ denotes the set of
$(d-1)$-dimensional intersections of elements $K, K' \in \mathcal{T}$.
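1.3 Governing Equations

Continuum mechanics of a compressible fluid in a domain
$\Omega \subset \mathbb{R}^d$ is described by the mass density
$\rho : \Omega \to \mathbb{R}$, the velocity field
$u : \Omega \to \mathbb{R}^d$ and the energy $e : \Omega \to \mathbb{R}$.
These fields are governed by the (compressible) Navier-Stokes
equations (NSE) which read (in Eulerian form)

Conservation of Mass

  $\frac{\partial \rho}{\partial t} + \sum_{j=1}^{d} \frac{\partial}{\partial x_j}(\rho u_j) = 0$   (1.1)

Conservation of Momentum

  $\frac{\partial}{\partial t}(\rho u_i) + \sum_{j=1}^{d} \frac{\partial}{\partial x_j}(\rho u_i u_j + p\,\delta_{ij}) = \sum_{j=1}^{d} \frac{\partial}{\partial x_j}\tau_{ij} + S_i$  for $i = 1, \dots, d$   (1.2)

Conservation of Energy

  $\frac{\partial}{\partial t}(\rho e) + \sum_{j=1}^{d} \frac{\partial}{\partial x_j}\big((\rho e + p)\,u_j\big) = \sum_{j=1}^{d} \frac{\partial}{\partial x_j}\Big(\kappa \frac{\partial T}{\partial x_j} + \sum_{i=1}^{d} \tau_{ij} u_i\Big)$   (1.3)

where $\kappa > 0$ denotes thermal diffusivity, $\tau$ is the stress tensor
describing the elastic effects in the fluid, $e$ is the internal energy and
$S \in L^2(\Omega)^d$ are given sources.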


Part  I 

Fundamentals of hp-FEM

2  Model  Problems 

We present several scalar model problems for the diffusive transport of a
scalar quantity u. They share many features which we will encounter later
on in the context of the Navier-Stokes equations.

Consider linear, diffusive transport of a scalar field $u(x,t)$ in
$\Omega \times (0,T)$ where $\Omega \subset \mathbb{R}^d$ is a bounded domain
with piecewise smooth boundary $\Gamma = \partial\Omega$; it is governed by the
equation

  $\frac{\partial u}{\partial t} + \mathrm{div}\, f(u) + a\,u = \mathrm{div}\, q(\nabla u) + S$  in $\Omega \times (0,T)$.   (2.1)

Here f and q are the convective resp. diffusive fluxes, and $a > 0$ is the
reaction constant. We consider here the linear fluxes

  $f(u) = \beta\,u$ ,   (2.2)

where $\beta \in L^\infty(\Omega)^d$ is the flux vector and

  $q(\nabla u) = A\,\nabla u$   (2.3)

where $A \in L^\infty(\Omega)^{d \times d}$ is a positive, possibly anisotropic
diffusivity matrix satisfying

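  $c\,\varepsilon\,|\xi|^2 \le \xi^\top A(x)\,\xi \le C\,\varepsilon\,|\xi|^2$  for all $\xi \in \mathbb{R}^d$, a.e. $x \in \Omega$   (2.4)

for some $\varepsilon > 0$ and constants $0 < c \le C$.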

Of particular interest is the case $A = \varepsilon I$, whence
$q(\nabla u) = \varepsilon \nabla u$ and (2.1) becomes with (2.2), (2.3)

  $\frac{\partial u}{\partial t} + \mathrm{div}(\beta u) + a\,u = \varepsilon \Delta u + S$  in $\Omega \times (0,T)$.   (2.5)

Here $S \in L^2(\Omega)$ is a source term which we assume time-independent
unless stated otherwise.

(2.5) is completed by initial and boundary conditions. To this end,
partition $\Gamma$ into 2 disjoint parts,

  $\Gamma = \Gamma_D \cup \Gamma_N$,  $\Gamma_D \cap \Gamma_N = \emptyset$.

Then we impose initial and boundary conditions

  $u = f$  on $\Gamma_D$,
  $q(\nabla u) \cdot n = g$  on $\Gamma_N$,   (2.6)
  $u(\cdot, 0) = u_0$  at $t = 0$.

Here n is the exterior unit normal vector to $\Gamma$. In the following, we will
discuss the hp-FE discretization of various special cases of (2.5). Since many
schemes are based on separate treatment of space and time variables, it is
useful to consider first the semidiscretization of (2.5) in space. These spatial
hp-discretizations can be introduced for the steady state case, i.e. for
$\partial u/\partial t = 0$, and this is what we will do in the sequel. We
consider special cases of (2.5), in particular the reaction-diffusion and the
pure advection problem.

While doing so, we will pay particular attention to the singular
perturbation character and the variational formulation of the problem: we review
classical, mixed, stabilized and the discontinuous Galerkin (DG)
formulations. Especially the latter ones are being used with increasing
frequency in FE flow simulations (see, e.g., [12,13,18-20,23,26,36,44,49,64-67])
but must be complemented by a suitable time stepping scheme. This will be the
topic of the second part of these notes, however.

The preferable type of discretization depends strongly on the dominant
terms in (2.1). We will address several particular cases:

2.1 Reaction-Diffusion

Here $A = \varepsilon I$, $\beta = 0$ and $a = 1$ so that (2.5), (2.6) become,
in the steady state case,

  $-\varepsilon \Delta u + u = S$  in $\Omega$,   (2.7)

  $u = f$  on $\Gamma_D$,  $\varepsilon \frac{\partial u}{\partial n} = g$  on $\Gamma_N$.   (2.8)


2.2  Convection 

We assume that in (2.5) $\beta \in C^1(\overline{\Omega})^d$ and that
$\varepsilon = 0$. Then (2.5) becomes, in the steady state case,

  $\beta \cdot \nabla u + (a + \mathrm{div}\,\beta)\,u = S$  in $\Omega$.   (2.9)

This equation is now first order hyperbolic in space and the boundary
conditions (2.6) cannot be imposed anymore. It is a model for the continuity
equation (1.1). The correct boundary conditions, for which the problem (2.9)
is well-posed, are as follows: Define in- and outflow boundaries

  $\Gamma_- = \{x \in \Gamma : \beta(x) \cdot n(x) < 0\}$,  $\Gamma_+ = \{x \in \Gamma : \beta(x) \cdot n(x) > 0\}$

and assume that $\Gamma = \overline{\Gamma}_- \cup \overline{\Gamma}_+$. The
“inflow” boundary condition for (2.9) is

  $u = f$  on $\Gamma_-$.   (2.10)

No boundary conditions can be prescribed on the outflow boundary $\Gamma_+$.
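
The splitting of the boundary by the sign of $\beta \cdot n$ can be sketched
for a polygonal domain with a constant flux vector. The example below is only
an illustration: the unit square, the edge names, the tolerance and the
function name are hypothetical choices, and edges with $\beta \cdot n = 0$
(characteristic edges) are reported separately since they belong to neither
$\Gamma_-$ nor $\Gamma_+$ as defined above.

```python
def classify_boundary(beta, edges, tol=1e-14):
    """Sort boundary edges into inflow (beta.n < 0), outflow (beta.n > 0)
    and characteristic (beta.n = 0) parts for a constant flux vector beta."""
    inflow, outflow, characteristic = [], [], []
    for name, normal in edges.items():
        flux = beta[0] * normal[0] + beta[1] * normal[1]
        if flux < -tol:
            inflow.append(name)
        elif flux > tol:
            outflow.append(name)
        else:
            characteristic.append(name)
    return inflow, outflow, characteristic

# Edges of the unit square with their exterior unit normals (hypothetical domain).
square = {
    "left": (-1.0, 0.0),
    "right": (1.0, 0.0),
    "bottom": (0.0, -1.0),
    "top": (0.0, 1.0),
}

oblique = classify_boundary((1.0, 0.5), square)   # flow crossing the square diagonally
aligned = classify_boundary((1.0, 0.0), square)   # flow parallel to bottom and top
```

In the first case the Dirichlet datum f is prescribed on the left and bottom
edges only; in the second case only on the left edge.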


2.3  Convection-Diffusion 

We observe in (2.9), (2.10) that the vanishing viscosity
$\varepsilon \to 0$ in (2.5) has caused a reduction of the order of the
equation and the loss of a boundary condition. This is a (very simple) model of
the transition from (incompressible) Navier-Stokes to (incompressible) Euler
(which is, however, not very well understood at present). The steady state
equation (2.5) with $\varepsilon > 0$ and $\beta \ne 0$ is the
convection-diffusion equation

  $\beta \cdot \nabla u + (a + \mathrm{div}\,\beta)\,u = \varepsilon \Delta u + S$  in $\Omega$,   (2.11)

together with the boundary conditions (2.6).

3 Solution properties

Any stable numerical scheme for the numerical solution of (2.1) - (2.11)
will generate solutions $u_N$ which approximate u. For the design of efficient
schemes it is therefore necessary to know certain qualitative features of the
solutions to be approximated. hp-FEM allow for simultaneous mesh refinement
and variation of the polynomial degree and constitute a generalization of
both the standard low order finite-volume / finite-element methods and
the so-called spectral methods. The large flexibility in hp-FEM is most
easily used with unstructured, triangular resp. tetrahedral meshes and high
polynomial degree, which is best suited for irregular flows with moving
features such as the vortex shedding in incompressible flow in the wake of a
cylinder. Nevertheless, substantial improvements in accuracy vs. degrees of
freedom (and, in particular, exponential convergence) can be realized by using
structured meshes in certain subregions of the flow.

In  the  following,  some  typical  solution  features  are  presented. 


3.1  Corner  Singularities 

Corner singularities are present in 2-dimensional domains whenever

a) the governing equations contain viscosity (i.e. diffusion or elasticity), [48],
and the boundary of the domain is not smooth at a point
$O \in \partial\Omega$ (even changes in curvature which may not be apparent
at first sight excite corner singularities), or

b) when inside a smooth boundary segment the boundary conditions change
abruptly (e.g. Py in Figure 3.1). In three dimensional domains, for
example in polyhedra, corner singularities arise at vertices; in addition, at
edges so-called edge-singularities appear which we discuss below.

Corner singularities are solution components with low regularity which are
poorly approximated by low order methods on uniform meshes. In the context
of convection-dominated problems, the resulting large approximation error
at the corner is transported downstream and may be responsible for spurious
solution features.

We discuss corner singularities in 2 dimensions. Let
$\Omega \subset \mathbb{R}^2$ be a polygon with M possibly curved sides
$\Gamma_j$, cf. Figure 3.1, and vertices $P_j$, $j = 1, \dots, M$.

Fig. 3.1. Polygon $\Omega$ with vertices $P_j$

Consider the reaction-diffusion problem (2.7) with $\varepsilon = 1$ in
$\Omega$ for smooth source terms S, f and g. We assume that the points of
$\overline{\Gamma}_D \cap \overline{\Gamma}_N$ coincide with vertices $P_j$,
i.e. each $\Gamma_j$ is contained in either $\Gamma_D$ or in $\Gamma_N$.

If the source terms are smooth, the solution u of (2.7), (2.8) is also smooth
inside $\Omega$, but not at the vertices $P_j$. More precisely, for any
$s > 0$ the solution u can be decomposed into a smooth part
$u_{\mathrm{reg}} \in H^{s+2}(\Omega)$ and singular functions
$S_{jk\ell}(r_j, \varphi_j)$:

  $u = u_{\mathrm{reg}} + \sum_{j=1}^{M} \chi(r_j) \sum_{k=1}^{K(s)} \sum_{\ell=0}^{L} a_{jk\ell}\, S_{jk\ell}(r_j, \varphi_j)$   (3.1)

where

  $u_{\mathrm{reg}} \in H^{s+2}(\Omega)$,
  $\chi(r) \ge 0$ is a smooth cut-off function,
  $\chi \equiv 1$ near zero,
  $S_{jk\ell}(r_j, \varphi_j) = r_j^{\lambda_{jk}} (\log r_j)^{\ell}\, \Phi_{jk\ell}(\varphi_j)$,   (3.2)
  $(r_j, \varphi_j)$ are polar coordinates at $P_j$,
  $\lambda_{jk} > 0$ is the singularity exponent,
  $\Phi_{jk\ell}(\varphi)$ is a smooth function of $\varphi$.

Notice the dependence of K in (3.1) on s: the smoother $u_{\mathrm{reg}}$ is
supposed to be, the larger is s and the more terms have to be included into the
decomposition. Decomposition (3.1) is by now classical in the theory of elliptic
equations; we mention here only [42] and the references there. It is important
to note that the $S_{jk\ell}$ and the $\lambda_{jk}$ do not depend on S, f and
on g in (2.7). They only depend on the interior angle of $\Omega$ at the vertex
$P_j$, the boundary conditions and on the diffusion operator. Analogous results
hold for solutions of (2.11) with $\varepsilon = 1$, since there once again the
diffusion part of the operator is equal to $-\Delta u$. The same result holds
also for systems, such as for the Stokes system or the system of linearized
elasticity arising in viscous, compressible flow.
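
How the exponents depend on the interior angle alone can be made concrete in
the simplest setting. The sketch below rests on an assumption that is not
spelled out in these notes: for the pure Laplacian with homogeneous Dirichlet
data on both sides of a corner with interior angle omega, the exponents are
the standard values k*pi/omega and the leading singular function is
r^lambda sin(lambda*phi) without logarithmic terms. The function names and the
L-shaped example are illustrative.

```python
import math

def dirichlet_exponents(omega, count=3):
    """Exponents lambda_k = k*pi/omega of the Laplacian with homogeneous Dirichlet
    data on both sides of a corner with interior angle omega (assumed setting)."""
    return [k * math.pi / omega for k in range(1, count + 1)]

def singular_function(r, phi, lam):
    """Model singular function r^lam * sin(lam*phi), without logarithmic terms."""
    return r ** lam * math.sin(lam * phi)

def polar_laplacian(lam, r, phi, h=1e-4):
    """Central-difference value of u_rr + u_r/r + u_phiphi/r^2 for the singular function."""
    u = lambda rr, pp: singular_function(rr, pp, lam)
    u_rr = (u(r + h, phi) - 2.0 * u(r, phi) + u(r - h, phi)) / h ** 2
    u_r = (u(r + h, phi) - u(r - h, phi)) / (2.0 * h)
    u_pp = (u(r, phi + h) - 2.0 * u(r, phi) + u(r, phi - h)) / h ** 2
    return u_rr + u_r / r + u_pp / r ** 2

# Re-entrant corner of an L-shaped domain: interior angle 3*pi/2.
lams = dirichlet_exponents(1.5 * math.pi)
lam = lams[0]                                  # leading exponent 2/3
lap = polar_laplacian(lam, r=0.5, phi=1.0)     # should vanish up to discretization error
```

Since the leading exponent 2/3 is smaller than 1, the gradient of the singular
function grows like r^(-1/3) as r tends to 0, whatever the smoothness of the
data.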
3.2 Boundary layers

Other interesting phenomena happen when $\varepsilon \to 0$ in (2.7), (2.11).
We see that formally, at $\varepsilon = 0$, the order of the equation changes:
(2.11) becomes the first order hyperbolic problem (2.9) and (2.7) the “zeroth”
order problem

  $u = S$  in $\Omega$.   (3.3)

Evidently, for general source terms S, this u will not satisfy the boundary
conditions (2.8) anymore. This (whole or partial) loss of boundary conditions
is typical when the viscosity $\varepsilon$ in the system vanishes. As
$\varepsilon \to 0$, the solution $u_\varepsilon$ of (2.7) forms steep
gradients near $\partial\Omega$. The simplest one-dimensional problem
exhibiting these effects is

  $-\varepsilon u'' + u = 1$  in $(-1, 1)$,  $u(\pm 1) = 0$.   (3.4)

We have the exact solution

  $u_\varepsilon(x) = 1 - \frac{\exp(-(1+x)/\sqrt{\varepsilon})}{1 + \exp(-2/\sqrt{\varepsilon})} - \frac{\exp(-(1-x)/\sqrt{\varepsilon})}{1 + \exp(-2/\sqrt{\varepsilon})}$   (3.5)

which is equal to a regular part, $u_{\mathrm{reg}} = S = 1$, up to two terms
that decay exponentially away from $\partial\Omega$, the so-called (viscous)
boundary layers; i.e. we have the decomposition
$u_\varepsilon = u_{\mathrm{reg}} + u_{\mathrm{bl}}$. For linear problems with
constant coefficients, viscous boundary layers are always exponential. If the
coefficients are nonconstant or the problem is nonlinear, generally no explicit
form of the layers is known. For nonconstant, analytic coefficients one can
show, however, that boundary layers with length scale d satisfy for every n the
estimates (see, e.g., [45] for a proof in the linear, variable coefficient case)

  $|D^n u_{\mathrm{bl}}(x)| \le C K^n \max\{n, 1/d\}^n \exp(-b\,\rho(x)/d)$   (3.6)

where $\rho(x) = \mathrm{dist}(x, \partial\Omega)$ is the distance to the
boundary and the positive constants b, C, K are independent of n and d.
Evidently, the solution (3.5) satisfies (3.6) with $d = \sqrt{\varepsilon}$.
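
The exact solution of (3.4) can be checked numerically. The following minimal
sketch writes the solution with the common denominator
1 + exp(-2/sqrt(eps)), which is the same function as
1 - cosh(x/sqrt(eps))/cosh(1/sqrt(eps)), and evaluates the boundary values and
a central-difference residual of the differential equation. The value of eps,
the sample points and the step size are arbitrary illustrative choices.

```python
import math

def u_eps(x, eps):
    """Solution of -eps*u'' + u = 1 on (-1, 1) with u(-1) = u(1) = 0, cf. (3.4)."""
    s = math.sqrt(eps)
    denom = 1.0 + math.exp(-2.0 / s)
    return 1.0 - (math.exp(-(1.0 + x) / s) + math.exp(-(1.0 - x) / s)) / denom

def residual(x, eps, h=1e-4):
    """Central-difference residual of -eps*u'' + u - 1 at the point x."""
    d2 = (u_eps(x + h, eps) - 2.0 * u_eps(x, eps) + u_eps(x - h, eps)) / h ** 2
    return -eps * d2 + u_eps(x, eps) - 1.0

eps = 1e-2
boundary_values = (u_eps(-1.0, eps), u_eps(1.0, eps))
max_residual = max(abs(residual(x, eps)) for x in (-0.9, -0.5, 0.0, 0.5, 0.9))

# Equivalent closed form, used here only as a cross-check.
s = math.sqrt(eps)
cosh_form = 1.0 - math.cosh(0.5 / s) / math.cosh(1.0 / s)
```

Away from the two end points the solution is indistinguishable from the
regular part u = 1; the two exponential terms are visible only within a
distance of a few multiples of sqrt(eps) from x = -1 and x = 1.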

In two dimensions, if $\partial\Omega$ is smooth, an analogous result holds: the
solution $u_\varepsilon$ can be decomposed into a regular part
$u_{\mathrm{reg}}(x)$ (whose derivatives remain bounded as
$\varepsilon \to 0$) and boundary layers $u_{\mathrm{bl}}$ (whose derivatives
behave like

  $|D^\alpha u_{\mathrm{bl}}| \sim O(\varepsilon^{-|\alpha|})$

as $\varepsilon \to 0$).

Generally, boundary layers of (2.7), (2.11) are special solutions of (2.7)
resp. (2.11) with $S = 0$, but with nonzero data f, g, of the form

  $u_{\mathrm{bl}} = U(\rho/d(\varepsilon))\, \Phi(\theta)$   (3.7)

where $(\rho, \theta)$ are boundary fitted coordinates near $\partial\Omega$
(see Figure 3.2).

Fig. 3.2. Boundary fitted coordinates $(\rho, \theta)$ in $\Omega$

In two dimensions, $0 \le \theta \le L$ is the arclength of $\partial\Omega$
and $\rho$ is the normal distance of a point $P = (x, y)$ to $\partial\Omega$.

The function $U(\cdot)$ in (3.7) is independent of $\varepsilon$ and decaying
for positive arguments; it is the so-called boundary layer profile. In all
linear problems, in particular in (2.7), (2.11), the boundary layer profile is
exponential, i.e.

  $U(\zeta) = \exp(-\zeta)$,  $\zeta > 0$.   (3.8)

The function $\Phi(\theta)$ in (3.7) is smooth independent of $\varepsilon$,
and $d(\varepsilon)$, the so-called length-scale of the layer, is usually some
simple power of $\varepsilon$: in (2.7), it is
$d(\varepsilon) = \sqrt{\varepsilon}$, whereas in (2.11)
$d(\varepsilon) = \varepsilon$ at non-characteristic boundaries and
$d(\varepsilon) = \sqrt{\varepsilon}$ at characteristic ones. In nonlinear
problems, not much is known about decompositions

  $u = u_{\mathrm{reg}} + u_{\mathrm{bl}}$.

For the incompressible Navier-Stokes equations, the profile $U(\zeta)$ in (3.8)
is the similarity solution of a nonlinear ODE which again is decaying as
$\zeta$ tends towards infinity, and the length scale is
$d(\varepsilon) = \mathrm{Re}^{-1/2}$.
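
The two length scales can be compared on explicit one-dimensional layer terms.
The sketch below is an illustration under stated assumptions: for the
reaction-diffusion problem it uses the layer term exp(-rho/sqrt(eps)) of (3.4),
and for convection-diffusion it uses the outflow layer exp(-rho/eps) of the
model problem -eps*u'' + u' = 0, which is not taken from these notes. The
threshold tol and the function names are hypothetical.

```python
import math

def reaction_diffusion_layer(rho, eps):
    """Layer term exp(-rho/sqrt(eps)); length scale d(eps) = sqrt(eps)."""
    return math.exp(-rho / math.sqrt(eps))

def convection_diffusion_layer(rho, eps):
    """Outflow layer exp(-rho/eps) of -eps*u'' + u' = 0; length scale d(eps) = eps."""
    return math.exp(-rho / eps)

def layer_width(profile, eps, tol=1e-2):
    """Distance rho from the boundary at which the layer has decayed to tol (bisection)."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if profile(mid, eps) > tol:
            lo = mid
        else:
            hi = mid
    return hi

eps = 1e-4
width_rd = layer_width(reaction_diffusion_layer, eps)     # about 4.6 * sqrt(eps)
width_cd = layer_width(convection_diffusion_layer, eps)   # about 4.6 * eps
ratio = width_rd / width_cd                                # equals 1/sqrt(eps)
```

For eps = 1e-4 the convection-diffusion layer is a hundred times thinner than
the reaction-diffusion layer, which is why the mesh near an outflow boundary
has to be designed with the scale eps rather than sqrt(eps) in mind.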


3.3  Viscous  Shock  Profiles 

One dimensional case. Consider the scalar conservation law with viscous
perturbation in one dimension

  $u_t + f(u)_x = \varepsilon u_{xx}$,  $(x, t) \in \mathbb{R} \times \mathbb{R}_+$   (3.9)

with initial condition

  $u(x, 0) = u_0(x)$.   (3.10)

For $\varepsilon = 0$, $u(x, t)$ in general develops discontinuities in finite
time. For $\varepsilon > 0$, these shocks are smeared out: we have a viscous
shock profile.

Assuming that (3.9) admits a steady asymptotic solution as $t \to \infty$, this
solution must satisfy

  $f(u)_x = \varepsilon u_{xx}$  in $\mathbb{R}$,   (3.11)
  $\lim_{x \to \pm\infty} u(x) = u^{\pm}$,   (3.12)

where $u^{\pm}$ are the right/left states of the shock. We see from (3.11) that
$u(x)$, if it exists, must have the form

  $u(x) = U((x - x_s)/\varepsilon)$   (3.13)

where $x_s$ is the shock location and $U(\cdot)$ is the viscous shock profile.
$U(\xi)$ satisfies the ordinary differential equation

  $U_{\xi\xi} = f(U)_\xi$,  $\xi \in (-\infty, \infty)$   (3.14)

with the boundary conditions

  $\lim_{\xi \to \pm\infty} U(\xi) = u^{\pm}$.   (3.15)

Assuming a solution U of (3.14) exists, this solution will be locally analytic
if the flux $f(\cdot)$ is analytic. Moreover, in many cases we have exponential
decay of $U(\xi)$ to $u^{\pm}$:

  $|U(\pm\xi) - u^{\pm}| \le C \exp(-b\,\xi)$,  $\xi \to \infty$.   (3.16)

See [68], Chapter 24, for more on this.

Consider the viscous Burgers’ equation (3.9) where $f(u) = u^2/2$. Here the
viscous shock profile developing for initial data

  $u_0(x) = a$ for $x < 0$,  $u_0(x) = -a$ for $x > 0$

with $a > 0$ has the form (3.13) with $x_s = 0$ and

  $U(\xi) = -a \tanh(\alpha\,\xi)$   (3.17)

for some $\alpha > 0$ independent of $\varepsilon$. (3.17) evidently satisfies
(3.16) with $u^{\pm} = \mp a$. We stipulate therefore that viscous shock
profiles are internal layers originating in the shock location $x_s$. The
viscous shock profiles can be seen as boundary layers at the (generally
unknown) free boundary $x_s$. The viscous shock profile $u(x)$ in (3.13) is
assumed to satisfy an estimate of the form (3.6), i.e. there are
$b, C, K > 0$ such that

  $|D^n u(x)| \le C K^n (\max(n, 1/\varepsilon))^n \exp(-b\,|x - x_s|/\varepsilon)$   (3.18)

for $n = 1, 2, \dots$ and $x \ne x_s$. The solution $u(x)$ with U as in the
example (3.17) is seen to satisfy condition (3.18).
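
The profile equation (3.14) for Burgers' flux can be verified numerically. The
sketch below assumes the value alpha = a/2 for the constant in the tanh
profile; this value follows from integrating (3.14) once with the limits
(3.15), but it is not stated explicitly in the text above. The step size, the
sample points and the function names are illustrative.

```python
import math

def shock_profile(xi, a):
    """Candidate profile U(xi) = -a*tanh(a*xi/2) for Burgers' flux f(u) = u^2/2."""
    return -a * math.tanh(0.5 * a * xi)

def ode_residual(xi, a, h=1e-3):
    """Central-difference residual of U'' - (U^2/2)' at xi, cf. (3.14)."""
    U = lambda t: shock_profile(t, a)
    d2 = (U(xi + h) - 2.0 * U(xi) + U(xi - h)) / h ** 2
    dflux = (0.5 * U(xi + h) ** 2 - 0.5 * U(xi - h) ** 2) / (2.0 * h)
    return d2 - dflux

a = 2.0
max_residual = max(abs(ode_residual(xi, a)) for xi in (-3.0, -1.0, -0.2, 0.0, 0.3, 1.5))
right_tail = abs(shock_profile(10.0, a) + a)    # distance to the right state -a
left_tail = abs(shock_profile(-10.0, a) - a)    # distance to the left state +a
```

Both tails are already below 1e-6 at |xi| = 10, reflecting the exponential
approach to the end states in (3.16); in the physical variable
x = x_s + eps*xi this corresponds to a transition region of width O(eps).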
Higher dimensional case. In dimension $d > 1$, shocks are discontinuities in solutions across possibly curved discontinuity surfaces $\Sigma$, which arise in nonlinear, hyperbolic equations. If viscosity is present, the discontinuities will be replaced once more by a viscous shock profile which we assume to have the following generic form: denoting by $(s, \rho)$ coordinates fitted to $\Sigma$ (see Figure 3.3) the viscous shock profile is of the form

$$ u_{sh}(s, \rho) = C(s)\, u_{bl}(|\rho|) \qquad (3.19) $$

where $u_{bl}(\rho)$ is a boundary layer function satisfying the estimate (3.6) with length scale equal to the viscosity parameter $\varepsilon$ and $C(s)$ is smooth (analytic) independent of $\varepsilon$, i.e. $\|D_s^{\ell} C\|_{L^\infty} \le c K^{\ell}\, \ell!$ for all $\ell$, where $c$, $K$ are independent of $\varepsilon$.


Fig. 3.3. Discontinuity surface $\Sigma$ and fitted coordinates $(s, \rho)$


We emphasize that the behavior (3.19) for viscous shock profiles is extrapolated from 1-d. There are, to date, no rigorous regularity results in the nonlinear setting for the solution $u_{sh}(s, \rho)$; in particular, the regularity at shock-boundary and shock-shock interaction points in the presence of viscosity is open. We take here the point of view that viscous shock profiles and boundary layers are closely related and that, likewise, the corner singularities and the shock-boundary interaction are of similar nature in that one has low regularity in an $O(\varepsilon)$ neighborhood of the interaction point of a globally relatively smooth (piecewise analytic) solution $u$.

4  Basic  hp  FEM 

We describe the main components of hp-FEM, beginning with the admissible meshes followed by the function spaces on these meshes. We make provisions for unstructured as well as for patchwise structured meshes since the latter are very advantageous for the resolution of specific flow phenomena, if they are combined with a proper distribution of the polynomial degrees. Many of the meshes used in present day CFD exhibit some structure, such as refinement towards the surfaces and uniform refinement at the trailing edge and corners. We will explain how proper design of meshes and polynomial degree distribution in the hp-FEM gives exponential convergence for the solution features of the previous section.


4.1 hp-FE Spaces on patchwise structured grids

Meshes. Let $\mathcal{P}$ denote a partition of $\Omega$ into open patches $P$ which are images of a reference patch $\hat P$ under smooth, bijective maps $F_P$:

$$ \forall P \in \mathcal{P}: \quad P = F_P(\hat P). $$

We assume that $\hat P$ is either the unit cube

$$ \hat P = \hat Q := (-1, 1)^d $$

or the unit simplex

$$ \hat P = \hat S := \Big\{ x \in \mathbb{R}^d : x_i > 0, \ \sum_{i=1}^{d} x_i < 1 \Big\}. $$

The meshes $\mathcal{T}$ are unions of patch meshes $\mathcal{T}_P$ which are constructed in the reference patch $\hat P$ and transported to $P \in \mathcal{P}$ via the patch map $F_P$. For each $P$, a patch mesh $\mathcal{T}_P$ is obtained by first subdividing $\hat P$ into triangles resp. quadrilaterals $\hat K$ which are affine equivalent to either $\hat Q$ or $\hat S$; we call this mesh $\hat{\mathcal{T}}_P$. A mesh $\mathcal{T}_P$ in $P \in \mathcal{P}$ is then obtained by simply mapping $\hat{\mathcal{T}}_P$ to $P$ using the patch map $F_P$:

$$ \forall P \in \mathcal{P}: \quad \mathcal{T}_P := \{ K \mid K = F_P(\hat K), \ \hat K \in \hat{\mathcal{T}}_P \}. \qquad (4.1) $$

The mesh $\mathcal{T}$ in $\Omega$ is the collection of all patch meshes, i.e.

$$ \mathcal{T} = \bigcup_{P \in \mathcal{P}} \mathcal{T}_P. $$

Note that each element $K \in \mathcal{T}$ is an image of the reference domain $\hat P$ via the element map $F_K$: if $K \subset P$ for some $P \in \mathcal{P}$,

$$ K = F_K(\hat P), \qquad F_K := F_P \circ A_K \qquad (4.2) $$

where $A_K : \hat P \to \hat K \subset \hat P$ is affine.

We emphasize that we could choose $A_K = \mathrm{id}$ and $\hat{\mathcal{T}}_P = \{\hat P\}$, thereby obtaining the usual parametric elements and arbitrary, unstructured meshes. However, it is advantageous in hp-FEM to use structured patch meshes $\hat{\mathcal{T}}_P$ such as geometric corner refinement, anisotropic boundary layer and edge refinement etc. In what follows, the partition $\mathcal{P}$ and the patch maps $F_{\mathcal{P}} = \{F_P : P \in \mathcal{P}\}$ shall be fixed, i.e. mesh refinement is performed in $\hat P$.

We call the mesh $\mathcal{T}$ regular, if for any two $K, K' \in \mathcal{T}$ the intersection $\bar K \cap \bar K'$ is either empty or an entire side (more precisely, an entire boundary segment of dimension $0 \le d' < d$, e.g. a vertex ($d' = 0$), an entire edge ($d' = 1$), an entire side ($d' = 2$) etc.). In order for the mesh $\mathcal{T}$ to be regular, the maps $F_P$ must be compatible between patches in the sense that

$$ \text{if } \bar P \cap \bar P' \ne \emptyset: \quad F_P \circ (F_{P'})^{-1} \big|_{\bar P \cap \bar P'} = \mathrm{id} \ \text{ on } \bar P \cap \bar P'. \qquad (4.3) $$

The $\mathcal{T}_P$ are 1-irregular, if they consist of quadrilateral resp. hexahedral elements with at most one irregular ("hanging") node per side. $\mathcal{T}$ is 1-irregular, if the $\mathcal{T}_P \subset \mathcal{T}$ are either regular or 1-irregular and compatible between patches.

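Polynomial subspaces. On the reference element $\hat P$ we define spaces of polynomials of degree $k \ge 0$ as follows:

$$ \mathcal{Q}_k = \mathrm{span}\{ \hat x^{\alpha} : 0 \le \alpha_i \le k, \ 1 \le i \le d \}, \qquad \mathcal{P}_k = \mathrm{span}\Big\{ \hat x^{\alpha} : 0 \le \alpha_i, \ 0 \le \sum_{i=1}^{d} \alpha_i \le k \Big\}. \qquad (4.4) $$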


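Polynomial subspaces on $\mathcal{T}$. Let $\mathcal{T}$ be any mesh consisting of patch meshes $\mathcal{T}_P$ and let

$$ \mathbf{k} = \{ k_K : K \in \mathcal{T} \} $$

be a polynomial degree vector on $\mathcal{T}$. The definition of a discontinuous hp-FE space is now straightforward: if $F_{\mathcal{P}} = \{F_P : P \in \mathcal{P}\}$ denotes the patch-map vector, we set

$$ S^{\mathbf{k},0}(\Omega, \mathcal{T}, F_{\mathcal{P}}) := \{ u \in L^2(\Omega) : u|_K \circ F_K \in \mathcal{Q}_{k_K} \text{ if } K \text{ is quadrilateral resp. } u|_K \circ F_K \in \mathcal{P}_{k_K} \text{ if } K \text{ is triangular} \}. \qquad (4.5) $$

No interelement continuity is imposed here. If the polynomial degree is uniform, $k_K = k$ for all $K \in \mathcal{T}$, we write $S^{k,0}(\Omega, \mathcal{T}, F_{\mathcal{P}})$. If $\mathcal{T}$ and $F_{\mathcal{P}}$ are clear from the context, we omit them and write $S^{\mathbf{k},0}(\Omega)$.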
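To make the size of the discontinuous space (4.5) concrete, the following sketch counts its degrees of freedom. It uses the standard dimension formulas $\dim \mathcal{Q}_k = (k+1)^d$ and $\dim \mathcal{P}_k = \binom{k+d}{d}$; the element encoding (a list of shape/degree pairs) is a hypothetical choice made only for this illustration.

```python
from math import comb

def dim_Q(k, d=2):
    """Dimension of Q_k: polynomials of degree <= k in each of d variables."""
    return (k + 1) ** d

def dim_P(k, d=2):
    """Dimension of P_k: polynomials of total degree <= k in d variables."""
    return comb(k + d, d)

def dim_discontinuous_space(elements, d=2):
    """Sum of the local dimensions; no continuity constraints couple the elements.

    `elements` is a list of (shape, degree) pairs with shape "quad" or "tri".
    """
    total = 0
    for shape, k in elements:
        total += dim_Q(k, d) if shape == "quad" else dim_P(k, d)
    return total

# Example: two quadrilaterals of degree 2 and one triangle of degree 3
mesh = [("quad", 2), ("quad", 2), ("tri", 3)]
n_dofs = dim_discontinuous_space(mesh)
print(n_dofs)
```

Because no continuity is imposed, the count is simply additive over the elements; for the continuous spaces the shared interface degrees of freedom would have to be identified.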

Let us now turn to continuous hp-FE spaces. Here we assume $\mathcal{T}$ to be either regular or 1-irregular. If the polynomial degrees $k_K$ are uniform, $k_K = k$ for all $K$, we define for $k \ge 1$

$$ S^{k,1}(\Omega, \mathcal{T}, F_{\mathcal{P}}) = S^{k,0}(\Omega, \mathcal{T}, F_{\mathcal{P}}) \cap H^1(\Omega), \qquad (4.6) $$

i.e. interelement continuity is now enforced and the compatibility (4.3) between patches is required. If the polynomial degrees are nonuniform, there are several ways to enforce interelement continuity. Assume that $K, K' \in \mathcal{T}$ share a $(d-1)$-dimensional set, and that $k_K < k_{K'}$. One can now either enrich the polynomials on $K$ or constrain the polynomials on $K'$. We adopt with (4.6) the latter approach.

Note that one could even allow anisotropic/nonuniform polynomial degrees within an element $K \in \mathcal{T}$; this becomes important when adaptivity is considered (see [21] and the references there). Definition (4.6) implies that DOFs from $K'$ that are unmatched by those from $K$ are constrained to zero on interfaces $\bar K \cap \bar K'$.


Basic hp-FE Spaces. We introduce the hp-FE subspaces $S^{\mathbf{k},\ell}(\Omega, \mathcal{T}, F_{\mathcal{P}})$, $\ell = 0, 1$, which are basic to the hp-FEM; $\ell = 0$ will denote discontinuous functions whereas $\ell = 1$ implies $H^1(\Omega)$ conformity, i.e. full continuity. These are the basic and most frequently used hp-spaces.

4.2 Choice of Patch Meshes $\mathcal{T}_P$ in 2-d

Preliminaries. A mesh $\mathcal{T}$ on a bounded polygonal patch $P \subset \mathbb{R}^2$ is a partition of $P$ into disjoint and open quadrilateral and/or triangular elements $\{K\}$ such that $\bar P = \bigcup_{K \in \mathcal{T}} \bar K$. The mesh $\mathcal{T}$ is called regular if for any two elements $K, K' \in \mathcal{T}$ the intersection $\bar K \cap \bar K'$ is either empty, a single vertex or an entire side. Otherwise, the mesh $\mathcal{T}$ is called irregular. We denote by $h_K$ the diameter of the element $K$ and by $\rho_K$ the diameter of the largest circle inscribed into $K$. The meshwidth $h$ of $\mathcal{T}$ is given by $h = \max_{K \in \mathcal{T}} h_K$. The fraction $\sigma_K := h_K / \rho_K$ is the aspect ratio of the cell $K$. A (regular or irregular) mesh $\mathcal{T}$ is called $\kappa$-shape regular, or $\kappa$-uniform, if there exists $\kappa > 0$ such that

$$ \max_{K \in \mathcal{T}} \sigma_K \le \kappa < \infty. \qquad (4.7) $$

$\mathcal{T}$ is called affine if each $K \in \mathcal{T}$ is affine equivalent to a reference element $\hat K$ which is either the square $\hat Q = (0, 1)^2$ or the triangle $\hat T = \{(x, y) : 0 < x < 1, \ 0 < y < x\}$, i.e.

$$ K = A_K(\hat K), \qquad A_K(\cdot) \text{ affine}. $$


Reference meshes. We introduce now some meshes on the reference elements.

Definition 4.1. Let $n \in \mathbb{N}_0$ and $\sigma \in (0, 1)$. On $\hat Q$, the (irregular) geometric mesh $\Delta_{n,\sigma}$ with $n + 1$ layers and grading factor $\sigma$ is created recursively as follows: If $n = 0$, $\Delta_{0,\sigma} = \{\hat Q\}$. Given $\Delta_{n,\sigma}$ for $n \ge 0$, $\Delta_{n+1,\sigma}$ is generated by subdividing that square $K \in \Delta_{n,\sigma}$ with $0 \in \bar K$ into four smaller rectangles by dividing the sides of $K$ in a $\sigma : (1 - \sigma)$ ratio. The (regular) geometric mesh $\tilde\Delta_{n,\sigma}$ is obtained from $\Delta_{n,\sigma}$ by removing the hanging nodes as indicated in Figure 4.1.

In Figure 4.1 the geometric meshes are shown for $n = 3$ and $\sigma = 0.5$. Clearly, $\Delta_{n,\sigma}$ is an irregular affine mesh (it has so-called hanging nodes), while $\tilde\Delta_{n,\sigma}$ is regular. The elements of the geometric mesh $\Delta_{n,\sigma}$ are numbered as in Figure 4.1, i.e.

$$ \Delta_{n,\sigma} = \{\Omega_{11}\} \cup \{\Omega_{ij} : 1 \le i \le 3, \ 2 \le j \le n + 1\}. \qquad (4.8) $$

The elements $\Omega_{1j}$, $\Omega_{2j}$ and $\Omega_{3j}$ constitute the layer $j$.

Fig. 4.1. The geometric meshes $\Delta_{n,\sigma}$ and $\tilde\Delta_{n,\sigma}$ with $n = 3$ and $\sigma = 0.5$.

Remark 4.2. On the reference triangle $\hat T$, $\Delta_{n,\sigma}$ and $\tilde\Delta_{n,\sigma}$ can be defined in a similar way. $\tilde\Delta_{n,\sigma}$ is depicted in Figure 4.3.

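Definition 4.3. Let $\mathcal{T}_x$ be an arbitrary mesh on $I = (0, 1)$, given by a partition of $I$ into subintervals $\{K_x\}$. On $\hat Q$, the boundary layer mesh $\Delta_{\mathcal{T}_x}$ is the product mesh

$$ \Delta_{\mathcal{T}_x} = \{ K : K = K_x \times I, \ K_x \in \mathcal{T}_x \}. $$

Figure 4.2 shows a typical boundary layer mesh. We emphasize that any $\mathcal{T}_x$ is allowed; in particular, rectangles of arbitrarily high aspect ratio can be used, so that boundary layer meshes are in general not $\kappa$-uniform.

Definition 4.4. Let $n \in \mathbb{N}_0$ and $\sigma \in (0, 1)$. Let $\mathcal{T}_{n,\sigma}$ be the one dimensional geometric mesh refined towards 0 given by a partition of $I = (0, 1)$ into subintervals $\{I_j\}_{j=1}^{n+1}$ where

$$ I_j = (x_{j-1}, x_j) \quad \text{with } x_0 = 0 \text{ and } x_j = \sigma^{n+1-j}, \quad j = 1, \dots, n + 1. $$

On $\hat Q$, the geometric tensor product mesh $\Delta^2_{n,\sigma}$ is then given by $\mathcal{T}_{n,\sigma} \otimes \mathcal{T}_{n,\sigma}$, i.e.

$$ \Delta^2_{n,\sigma} = \{ I_j \times I_k : I_j \in \mathcal{T}_{n,\sigma}, \ I_k \in \mathcal{T}_{n,\sigma} \}. $$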
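The construction in Definition 4.4 is easy to reproduce. The sketch below builds the one dimensional geometric nodes $x_j = \sigma^{n+1-j}$, forms the tensor product rectangles and reports the largest aspect ratio (diameter over inscribed-circle diameter, as in the Preliminaries). The function names and the data layout are illustrative assumptions.

```python
import math

def geometric_nodes(n, sigma):
    """Nodes x_0 = 0, x_j = sigma**(n + 1 - j), j = 1..n+1, of the 1-d geometric mesh."""
    return [0.0] + [sigma ** (n + 1 - j) for j in range(1, n + 2)]

def tensor_product_mesh(n, sigma):
    """All rectangles I_j x I_k, each stored as ((x_lo, x_hi), (y_lo, y_hi))."""
    x = geometric_nodes(n, sigma)
    intervals = list(zip(x[:-1], x[1:]))
    return [(ij, ik) for ij in intervals for ik in intervals]

def aspect_ratio(rect):
    """Diameter of the rectangle divided by the diameter of its largest inscribed circle."""
    (x0, x1), (y0, y1) = rect
    a, b = x1 - x0, y1 - y0
    return math.hypot(a, b) / min(a, b)

for n in (2, 4, 8):
    mesh = tensor_product_mesh(n, 0.5)
    print(n, len(mesh), max(aspect_ratio(r) for r in mesh))
```

The mesh has $(n+1)^2$ rectangles, and the largest aspect ratio grows geometrically with $n$, which is why these meshes are not $\kappa$-uniform with a bound independent of $n$.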
Fig. 4.2. Boundary layer mesh and geometric tensor product mesh on $\hat Q$.


The tensor product mesh $\Delta^2_{n,\sigma}$ contains anisotropic rectangles with arbitrarily large aspect ratio. For the proof of the inf-sup conditions ahead, it is important that $\Delta^2_{n,\sigma}$ can be understood as the geometric mesh $\Delta_{n,\sigma}$ into which appropriately scaled versions of boundary layer meshes $\Delta_{\mathcal{T}_x}$ are inserted to remove the hanging nodes. A geometric tensor product mesh is shown in Figure 4.2 with $n = 5$ and $\sigma = 0.5$. The underlying geometric mesh $\Delta_{n,\sigma}$ is indicated by bold lines.

Remark 4.5. As before, $\Delta^2_{n,\sigma}$ can also be defined on the reference triangle $\hat T$. This is shown in Figure 4.3. On the reference square $\hat Q$ we can even admit mixtures of geometric tensor product meshes and geometric meshes $\Delta_{n,\sigma}$ or $\tilde\Delta_{n,\sigma}$ as illustrated in Figure 4.4. They are denoted by $\Delta^m_{n,\sigma}$ and $\tilde\Delta^m_{n,\sigma}$. Of course, other mixtures are imaginable.


Fig. 4.3. The meshes $\tilde\Delta_{n,\sigma}$ and $\Delta^2_{n,\sigma}$ on the reference triangle $\hat T$.

Fig. 4.4. The meshes $\Delta^m_{n,\sigma}$ and $\tilde\Delta^m_{n,\sigma}$ with $n = 5$ and $\sigma = 0.5$.

Admissible patch meshes $\mathcal{T}$.

Definition 4.6. An affine mesh $\mathcal{T}$ on $P$ is called a $(\mathcal{F}, \mathcal{T}_m, \kappa)$-mesh if

1. $\mathcal{T}_m$ is an affine mesh which is coarser than $\mathcal{T}$ and $\kappa$-uniform for some $\kappa > 0$. The elements of $\mathcal{T}_m$ are called macro-elements and $\mathcal{T}_m$ is the macro-element mesh of $\mathcal{T}$.

2. $\mathcal{F}$ is a nonempty family of affine reference meshes on the reference square $\hat Q$ or the reference triangle $\hat T$.

3. The restriction $\mathcal{T}_K := \mathcal{T}|_K$ of $\mathcal{T}$ to any macro-element $K \in \mathcal{T}_m$ is given by $\mathcal{T}_K = F_K(\hat{\mathcal{T}})$ for some $\hat{\mathcal{T}}$ in $\mathcal{F}$ where $F_K$ is the affine mapping between $\hat K$ and $K$.

A $(\mathcal{F}, \mathcal{T}_m, \kappa)$-mesh is thus obtained from the $\kappa$-uniform mesh $\mathcal{T}_m$ by refining some or all elements with the strategies given by the family $\mathcal{F}$. In the simple case where

$$ \mathcal{F} = \{ \{\hat Q\}, \{\hat T\} \}, $$

the notion "$(\mathcal{F}, \mathcal{T}_m, \kappa)$-mesh" reduces to the already introduced notion of $\kappa$-uniform affine meshes consisting of quadrilaterals and/or triangles and the notion of "macro-elements" becomes unnecessary. We are mainly interested in the family


Fa  —  An!(r,  Aj-x,  An(r,  (4  9) 

{Q},  {f }  :  n  e  IN0,  %  arbitrary} 

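for $\sigma \in (0, 1)$ fixed. Here, $\Delta_{n,\sigma}$ and $\Delta^2_{n,\sigma}$ are understood as meshes on $\hat Q$ or $\hat T$. Alternatively, one could consider $\tilde\Delta_{n,\sigma}$ as a part of the macro-element mesh $\mathcal{T}_m$ and put only the irregular patches into the family $\mathcal{F}_\sigma$. If $\mathcal{T}$ contains no triangles, $\mathcal{F}_\sigma$ can be reduced to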

T°  =  { An,CT,  A% ,  A™,,,  {Q}  :  n  £  IN0,  %  arbitrary}  (4.10) 


where $\Delta_{n,\sigma}$ and $\Delta^2_{n,\sigma}$ now have to be meshes on $\hat Q$. We call a $(\mathcal{F}_\sigma, \mathcal{T}_m, \kappa)$-mesh shortly a $(\kappa, \sigma)$-mesh, where we choose the reduced family (4.10) if $\mathcal{T}$ contains no triangles.

$(\kappa, \sigma)$-meshes are a quite general class of possibly highly irregular meshes. They are well suited for the effective resolution of boundary layer and corner singularity phenomena. Typically, mesh-patches from $\mathcal{T}_m$ near the boundary of the domain are partitioned anisotropically using $\Delta_{\mathcal{T}_x}$-meshes to approximate boundary layers. Patches near corners are geometrically refined towards the corner with the meshes $\Delta_{n,\sigma}$ or $\Delta^2_{n,\sigma}$. This takes into account boundary layers as well as the singular behaviour of the solution near a corner. In the interior of the domain a simple $\kappa$-uniform mesh can be used. Some examples of $(\kappa, \sigma)$-meshes are shown in Figures 4.5 and 4.6.

Fig. 4.5. Geometric $(\kappa, \sigma)$-boundary layer meshes near convex corners.

Fig. 4.6. Geometric $(\kappa, \sigma)$-boundary layer meshes near reentrant corners.


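4.3 hp-spaces on $\mathcal{T}$

We introduce the hp-FE spaces investigated later on. To this end, let $\mathcal{T}$ be an affine mesh on $P$. With each element $K \in \mathcal{T}$ we associate a polynomial degree $k_K$. All degrees are stored in a degree vector

$$ \mathbf{k} = \{ k_K : K \in \mathcal{T} \}. \qquad (4.11) $$

We define spaces of continuous and discontinuous piecewise polynomial functions, respectively, by

$$ S^{\mathbf{k},1}(P, \mathcal{T}) := \Big\{ u \in H^1(P) : u|_K \circ A_K \in \begin{cases} \mathcal{Q}_{k_K}(\hat Q) & \text{if } K \text{ is a quadrilateral} \\ \mathcal{P}_{k_K}(\hat T) & \text{if } K \text{ is a triangle} \end{cases} \quad \forall K \in \mathcal{T} \Big\} \qquad (4.12) $$

and

$$ S^{\mathbf{k},0}(P, \mathcal{T}) := \Big\{ p \in L^2(P) : p|_K \circ A_K \in \begin{cases} \mathcal{Q}_{k_K}(\hat Q) & \text{if } K \text{ is a quadrilateral} \\ \mathcal{P}_{k_K}(\hat T) & \text{if } K \text{ is a triangle} \end{cases} \quad \forall K \in \mathcal{T} \Big\}. \qquad (4.13) $$

We set further

$$ S_0^{\mathbf{k},1}(P, \mathcal{T}) = S^{\mathbf{k},1}(P, \mathcal{T}) \cap H_0^1(P), \qquad S_0^{\mathbf{k},0}(P, \mathcal{T}) = S^{\mathbf{k},0}(P, \mathcal{T}) \cap L_0^2(P). $$

If the polynomial degree is constant throughout the mesh $\mathcal{T}$ (i.e. $k_K = k$ $\forall K \in \mathcal{T}$), we use the shorthand notations $S^{k,1}(P, \mathcal{T})$ and $S^{k,0}(P, \mathcal{T})$.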


5 hp-Error Estimates

We present hp-error estimates with particular attention to the approximation of boundary layers, corner singularities and viscous shock profiles as discussed above. It is well-known that hp and spectral methods achieve exponential convergence rates for smooth (analytic) solutions ([17,58]). Exponential convergence of hp-FEM for boundary layers, corner singularities and viscous shock profiles, however, requires the combination of structured patch meshes $\mathcal{T}_P$ with the proper polynomial degree distribution $\mathbf{k}$. hp-FEM are robust in the sense that exponential convergence holds even under certain changes in the mesh and the degree distribution which we will indicate in each case. This makes the results presented here relevant in practice.

5.1  Basic  error  estimates 


One dimensional hp-approximation. We cite some approximation results from [58]. To this end, we set $I = (-1, 1)$ and denote by $\|u\|_{k,I}$ resp. $|u|_{k,I}$ the $H^k(I)$ norm resp. seminorm on $I$. Denote further by $S^p(I)$ the polynomials of degree $p$ on $I$. Then we have

Theorem 5.1. Let $u_0 \in H^{k+1}(I)$ for some $k \ge 0$. Then, for every $p \ge 1$, there exists $s_0 = \pi_p u_0 \in S^p(I)$ such that

$$ \|u_0' - s_0'\|^2_{0,I} \le \frac{(p-s)!}{(p+s)!}\, |u_0|^2_{s+1,I} \qquad (5.1) $$

for any $0 \le s \le \min(p, k)$ (interpreting the factorials in terms of Gamma functions and the norms as interpolation norms for fractional indices) and such that

$$ \|u_0 - s_0\|^2_{0,I} \le \frac{1}{p(p+1)} \frac{(p-t)!}{(p+t)!}\, |u_0|^2_{t+1,I} \qquad (5.2) $$

for any $0 \le t \le \min(p, k)$. Moreover, we have

$$ s_0(\pm 1) = u_0(\pm 1). \qquad (5.3) $$

For  the  proof,  we  refer  e.g.  to  [58].  We  emphasize  that  in  (5.1),  (5.2)  the 
dependence  of  the  error  on  the  polynomial  degree  p  as  well  as  on  the  reg¬ 
ularities  s,  t  of  u  is  completely  explicit.  Such  results  cannot  be  obtained  by 
Taylor’s  theorem  and  its  generalizations  which  are  common  in  the  analysis 
of  low  order  FEM. 
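The bounds of Theorem 5.1 can be checked numerically. The sketch below assumes one standard construction of such a projector (the text above does not spell it out): $s_0'$ is the $L^2(I)$ projection of $u_0'$ onto polynomials of degree $p - 1$, and $s_0(-1) = u_0(-1)$. With this choice (5.3) holds automatically, and the code compares the squared $L^2$ error with the right-hand side of (5.2) for $t = 0$. The test function and all names are illustrative.

```python
import numpy as np
from numpy.polynomial import legendre as L

def project(u, du, p, nquad=200):
    """Legendre coefficients of s0 with s0' = L2-projection of u' onto degree p-1
    and s0(-1) = u(-1). (Assumed construction, see the lead-in.)"""
    t, w = L.leggauss(nquad)
    # Legendre coefficients of the projection of u': c_i = (2i+1)/2 * int u' L_i
    c = np.array([(2 * i + 1) / 2.0 * np.sum(w * du(t) * L.Legendre.basis(i)(t))
                  for i in range(p)])
    # Antiderivative whose value at the lower bound -1 equals u(-1)
    return L.legint(c, m=1, k=[u(-1.0)], lbnd=-1.0)

u = lambda x: np.sin(3.0 * x)
du = lambda x: 3.0 * np.cos(3.0 * x)

t, w = L.leggauss(200)
du_norm2 = np.sum(w * du(t) ** 2)                     # |u|_{1,I}^2
results = []
for p in range(1, 9):
    s = project(u, du, p)
    err2 = np.sum(w * (u(t) - L.legval(t, s)) ** 2)   # ||u - s0||_{0,I}^2
    bound = du_norm2 / (p * (p + 1))                  # (5.2) with t = 0
    end_gap = max(abs(L.legval(1.0, s) - u(1.0)),
                  abs(L.legval(-1.0, s) - u(-1.0)))
    results.append((p, err2, bound, end_gap))
    print(p, err2, bound, end_gap)
```

For a smooth function such as this one the actual error falls far below the $t = 0$ bound as $p$ grows, which is what the estimates with larger $t$ express.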

Corollary 5.2. The projector $\pi_p$ in Theorem 5.1 is bounded as follows:

$$ \|(\pi_p u)'\|_{0,I} \le 2 \|u'\|_{0,I}, \qquad (5.4) $$

$$ \|\pi_p u\|_{0,I} \le \|u\|_{0,I} + \frac{1}{\sqrt{p(p+1)}} \|u'\|_{0,I} \qquad (5.5) $$

for all $p \ge 1$ and every $u \in H^1(I)$.

Proof. (5.1) with $s = 0$ implies (5.4) since

$$ \|s_0'\|_{0,I} \le \|s_0' - u_0'\|_{0,I} + \|u_0'\|_{0,I} \le 2 \|u_0'\|_{0,I}. $$

(5.2) with $t = 0$ implies (5.5) since

$$ \|s_0\|_{0,I} \le \|s_0 - u_0\|_{0,I} + \|u_0\|_{0,I} \le \|u_0\|_{0,I} + \frac{1}{\sqrt{p(p+1)}} \|u_0'\|_{0,I}. $$

□


Approximation on quadrilaterals. Higher dimensional approximation results can be obtained from Theorem 5.1 by tensor product construction. We denote by $\pi^i_p u_0$ the one-dimensional projector in Theorem 5.1 applied to $u_0$ as function of the $i$-th coordinate alone and perform the error analysis for $d = 2$.

Let $\hat Q = (-1, 1)^2$ and denote by $\gamma_i$, $i = 1, 2, 3, 4$, the sides of $\hat Q$ as shown in Figure 5.1.

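Theorem 5.3. (Reference Element Approximation) Let $\hat Q = (-1, 1)^2$ as in Figure 5.1 and $u_0 \in H^{k+1}(\hat Q)$ for some $k \ge 1$. Let $\Pi_p = \pi^1_p \pi^2_p$ denote the tensor product projector. Then there holds:

$$ \Pi_p u_0 = u_0 \ \text{ at the vertices of } \hat Q, \qquad (5.6) $$

$$ (\Pi_p u_0)|_{\gamma_i} = \begin{cases} \pi^1_p (u_0|_{\gamma_i}) & \text{if } i \text{ is odd}, \\ \pi^2_p (u_0|_{\gamma_i}) & \text{if } i \text{ is even}. \end{cases} \qquad (5.7) $$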
Fig. 5.1. $\hat Q$ and the notation for the sides


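There hold the error estimates

$$ \|\nabla(u_0 - \Pi_p u_0)\|^2_{0,\hat Q} \le 2 \frac{(p-s)!}{(p+s)!} \Big\{ \|\partial_1^{s+1} u_0\|^2_{0,\hat Q} + \|\partial_2^{s+1} u_0\|^2_{0,\hat Q} \Big\} + \frac{8}{p(p+1)} \frac{(p-s+1)!}{(p+s-1)!} \Big\{ \|\partial_1^{s} \partial_2 u_0\|^2_{0,\hat Q} + \|\partial_1 \partial_2^{s} u_0\|^2_{0,\hat Q} \Big\} \qquad (5.8) $$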

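and

$$ \|u_0 - \Pi_p u_0\|^2_{0,\hat Q} \le \frac{2}{p(p+1)} \frac{(p-s)!}{(p+s)!} \Big\{ \|\partial_1^{s+1} u_0\|^2_{0,\hat Q} + 2 \|\partial_2^{s+1} u_0\|^2_{0,\hat Q} \Big\} + \frac{4}{p^2(p+1)^2} \frac{(p-s+1)!}{(p+s-1)!} \|\partial_1 \partial_2^{s} u_0\|^2_{0,\hat Q} \qquad (5.9) $$

for any $0 \le s \le \min(p, k)$.

Proof. We prove (5.9). It holds

$$ \|u - \Pi_p u\|^2_{0,\hat Q} \le 2 \|u - \pi^1_p u\|^2_{0,\hat Q} + 2 \|\pi^1_p (u - \pi^2_p u)\|^2_{0,\hat Q}. $$

For the first term we use the bound (5.2), resulting in

$$ \|u - \pi^1_p u\|^2_{0,\hat Q} \le \frac{1}{p(p+1)} \frac{(p-s)!}{(p+s)!} \|\partial_1^{s+1} u\|^2_{0,\hat Q}. $$

For the second term, (5.5) and (5.1), (5.2) give

$$ \|\pi^1_p (u - \pi^2_p u)\|^2_{0,\hat Q} \le 2 \|u - \pi^2_p u\|^2_{0,\hat Q} + \frac{2}{p(p+1)} \|\partial_1 (u - \pi^2_p u)\|^2_{0,\hat Q} \le \frac{2}{p(p+1)} \frac{(p-t)!}{(p+t)!} \|\partial_2^{t+1} u\|^2_{0,\hat Q} + \frac{2}{p^2(p+1)^2} \frac{(p-r)!}{(p+r)!} \|\partial_1 \partial_2^{r+1} u\|^2_{0,\hat Q}. $$

Selecting $t = s$ and $r = s - 1$ gives (5.9). The proof of (5.8) is analogous. □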


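Approximation on quadrilateral meshes with hanging nodes. Consider now a patch $P \in \mathcal{P}$ with mesh $\mathcal{T}_P$ and corresponding reference mesh $\hat{\mathcal{T}}_P$ in $\hat P$. We assume that all $K \in \mathcal{T}_P$ are quadrilateral, possibly with hanging nodes.

Theorem 5.4. (Discontinuous Approximation) Let $P \in \mathcal{P}$ with quadrilateral, possibly 1-irregular mesh $\mathcal{T}_P$ of shape-regular elements and polynomial degree distribution $\mathbf{p}$. For all $K \in \mathcal{T}_P$ let $u|_K \in H^{k_K+1}(K)$ for some $k_K \ge 1$ and define $\Pi u \in S^{\mathbf{p},0}(P, \mathcal{T}_P)$ elementwise by

$$ (\Pi u)|_K \circ F_K := \Pi_{p_K}(u|_K \circ F_K) \qquad \forall K \in \mathcal{T}_P $$

with $\Pi_p$ as in Theorem 5.3.

Then there holds the estimate for $0 \le s_K \le \min(p_K, k_K)$

$$ \|u - \Pi u\|^2_{0,P} \le C \sum_{K \in \mathcal{T}_P} \Big( \frac{h_K}{2} \Big)^{2 s_K + 2} \frac{1}{p_K (p_K + 1)}\, \Phi(p_K, s_K)\, |\hat u|^2_{s_K+1,\hat K} \qquad (5.10) $$

where $\hat u = u \circ F_P$, $K = F_P(\hat K)$ and where

$$ \Phi(p, s) := \frac{(p-s)!}{(p+s)!} + \frac{1}{p(p+1)} \frac{(p-s+1)!}{(p+s-1)!}, \qquad 0 \le s \le p. $$

Further, there holds

$$ \|\nabla(u - \Pi u)\|^2_{0,P} \le C \sum_{K \in \mathcal{T}_P} \Big( \frac{h_K}{2} \Big)^{2 s_K} \Phi(p_K, s_K)\, \|\hat u\|^2_{s_K+1,\hat K}. \qquad (5.11) $$

The constant $C > 0$ depends only on $F_P$, but is independent of $h_K$, $p_K$ and $s_K$.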

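Proof. The $L^2$-estimate (5.10) follows immediately by a change of variables and a scaling argument from Theorem 5.3.

For the gradient estimate, we observe that

$$ \|\nabla(u - \Pi u)\|^2_{0,P} \le C(F_P) \Big( \|(u - \Pi u) \circ F_P\|^2_{0,\hat P} + \|\nabla((u - \Pi u) \circ F_P)\|^2_{0,\hat P} \Big). $$

For the first term we use (5.10), for the second one we use (5.8), after scaling to the reference element:

$$ \|\nabla((u - \Pi u) \circ F_P)\|^2_{0,\hat P} = \sum_{\hat K \in \hat{\mathcal{T}}_P} \Big( \|\partial_1((u - \Pi u) \circ F_P)\|^2_{0,\hat K} + \|\partial_2((u - \Pi u) \circ F_P)\|^2_{0,\hat K} \Big) = \sum_{\hat K \in \hat{\mathcal{T}}_P} \sum_{i=1,2} \frac{h_{1,K} h_{2,K}}{h_{i,K}^2} \|\partial_i (I - \Pi_{p_K}) \hat u_{0,K}\|^2_{0,\hat Q} $$

$$ \le C \sum_{\hat K \in \hat{\mathcal{T}}_P} \Big\{ \frac{(p_K - s_K)!}{(p_K + s_K)!} \Big( \|\partial_1^{s_K+1} \hat u_{0,K}\|^2_{0,\hat Q} + \|\partial_2^{s_K+1} \hat u_{0,K}\|^2_{0,\hat Q} \Big) + \frac{1}{p_K(p_K+1)} \frac{(p_K - s_K + 1)!}{(p_K + s_K - 1)!} \Big( \|\partial_1^{s_K} \partial_2 \hat u_{0,K}\|^2_{0,\hat Q} + \|\partial_1 \partial_2^{s_K} \hat u_{0,K}\|^2_{0,\hat Q} \Big) \Big\} $$

where

$$ \hat u_{0,K} := u \circ F_P \circ A_K = \hat u \circ A_K, \qquad \hat K \in \hat{\mathcal{T}}_P. $$

Affine scaling from $\hat P$ to $\hat K \in \hat{\mathcal{T}}_P$ gives the assertion. □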

The  error  bounds  in  Theorem  5.4  simplify  for  uniform  p. 

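Corollary 5.5. (Uniform order estimate) Assume that $\hat u := u \circ F_P \in H^{k+1}(\hat P)$ and that for all $K \in \mathcal{T}_P$

$$ p_K = p, \qquad s_K = s, \qquad 0 \le s \le \min(p, k). $$

Then there holds for $\Pi u \in S^{p,0}(P, \mathcal{T}_P)$

$$ \|u - \Pi u\|^2_{0,P} \le \frac{C}{p(p+1)}\, \Phi(p, s) \sum_{K \in \mathcal{T}_P} \Big( \frac{h_K}{2} \Big)^{2s+2} |\hat u|^2_{s+1,\hat K} \qquad (5.12) $$

and

$$ \|\nabla(u - \Pi u)\|^2_{0,P} \le C\, \Phi(p, s) \sum_{K \in \mathcal{T}_P} \Big( \frac{h_K}{2} \Big)^{2s} \|\hat u\|^2_{s+1,\hat K}. \qquad (5.13) $$

Here $C$ depends only on the patch mapping $F_P$ but not on $s$, $p$, $h_K$.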
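To get a feeling for the factor $\Phi(p, s)$ in these bounds, the following sketch evaluates it exactly with rational arithmetic. It uses the definition $\Phi(p, s) = \frac{(p-s)!}{(p+s)!} + \frac{1}{p(p+1)} \frac{(p-s+1)!}{(p+s-1)!}$ as read from Theorem 5.4, for integer $s$ only; the function name is illustrative. For a fixed regularity index $s$ the factor decays algebraically in $p$, whereas letting $s$ grow with $p$, as is possible for analytic $u$, gives much faster decay.

```python
from fractions import Fraction
from math import factorial

def phi(p, s):
    """Phi(p, s) for integers p >= 1 and 0 <= s <= p, as an exact fraction."""
    if not (p >= 1 and 0 <= s <= p):
        raise ValueError("need p >= 1 and 0 <= s <= p")
    first = Fraction(factorial(p - s), factorial(p + s))
    second = Fraction(factorial(p - s + 1), factorial(p + s - 1) * p * (p + 1))
    return first + second

for p in (2, 4, 8, 16):
    # fixed s = 2 versus s = p
    print(p, float(phi(p, 2)), float(phi(p, p)))
```

Exact fractions avoid overflow and rounding issues of the factorial quotients for moderate $p$; for fractional $s$ one would switch to Gamma functions, as indicated in Theorem 5.1.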
Remark 5.6. (Anisotropic error estimates) We note in passing that the above error estimates assumed the shape regularity of the $K$ merely for convenience. In fact, the explicit error bounds in Theorems 5.3 and 5.4 above could easily be generalized to anisotropic element shapes (with edge-lengths $h_{1K}$ and $h_{2K}$) and even to anisotropic polynomial degrees $p_{1K}$, $p_{2K}$, say. Error bounds explicit in these parameters can be deduced by inspecting the proofs of the above theorems.

Theorem  5.4  addressed  only  discontinuous  approximations;  it  turns  out, 
however,  that  also  continuous,  piecewise  polynomial  approximations  can  be 
obtained. 

Theorem 5.7. (Continuous approximations) Let $\Omega \subset \mathbb{R}^2$ and let $P \in \mathcal{P}$ with a 1-irregular mesh consisting of shape regular quadrilaterals $K$ of diameter $h_K$. Let the polynomial degree be uniform, $p_K = p$. Let $u|_K \in H^{k_K+1}(K)$ for some $k_K \ge 1$ and let $u \in H^2(P)$.

Then there exists a projector $\tilde\Pi u \in S^{p,1}(P, \mathcal{T}_P)$ such that the error bounds (5.12), (5.13) hold, with a possibly different value of $C$.

Proof. If $\mathcal{T}_P$ does not contain hanging nodes, $\mathcal{T}_P$ is regular and we take $\tilde\Pi = \Pi$ in Theorem 5.4. Since $\Pi$ was constructed elementwise, the properties (5.6), (5.7) together with the assumption that $u \in H^2(P)$ give the continuity of $\Pi u$ in $P$.

Consider now that $\mathcal{T}_P$ contains hanging nodes. A typical situation in the reference mesh $\hat{\mathcal{T}}_P$ is shown in Figure 5.2 where the elements have been scaled to unit size for convenience.


Fig. 5.2. Hanging node • and adjacent elements


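Since $u \in H^2(P)$, $u \in C^0(\bar P)$. By (5.6), $u - \Pi u$ vanishes at the points × in Figure 5.2. Denote by $[u - \Pi u]_{ij}$ the jump of $u - \Pi u$ across $\gamma_{ij}$. By (5.7), the jump of $\Pi u$ across $\gamma_{23}$ is zero. Since $u \in C^0(\bar P)$, $[u - \Pi u]_{ij} = -[\Pi u]_{ij}$. Further, $[\Pi u]_{ij} \in \mathcal{P}_p(\gamma_{ij})$.

We now construct a trace-lifting of $[\Pi u]$ across $\gamma_{12} \cup \gamma_{13}$ as follows: We set

$$ V(\xi) = -(\xi_2 + 1) \begin{cases} [\Pi u]_{12}(\xi_1) & \text{on } K_2, \\ [\Pi u]_{13}(\xi_1) & \text{on } K_3. \end{cases} $$

Since $[\Pi u]_{23} = 0$, $V$ is continuous on $K_2 \cup K_3$ and

$$ \|V\|_{1, K_2 \cup K_3} \le C\, \|[\Pi u]\|_{H^{1/2}_{00}(\gamma_{12} \cup \gamma_{13})} \qquad (5.14) $$

where $C$ is independent of $p$. By the trace theorem and since $u \in C^0(\bigcup_i \bar K_i)$, we have

$$ \|[\Pi u]\|_{H^{1/2}_{00}(\gamma_{12} \cup \gamma_{13})} = \|[u - \Pi u]\|_{H^{1/2}_{00}(\gamma_{12} \cup \gamma_{13})} \le \|(u - \Pi u)^{+}\|_{H^{1/2}_{00}(\gamma_{12} \cup \gamma_{13})} + \|(u - \Pi u)^{-}\|_{H^{1/2}_{00}(\gamma_{12} \cup \gamma_{13})} \le C \sum_{i=1}^{3} \|u - \Pi u\|_{H^1(K_i)} \qquad (5.15) $$

where $(\cdot)^{\pm}$ denote traces from $\xi_2 > 0$ and $\xi_2 < 0$, respectively. By construction $V + \Pi u$ is continuous on and across $\gamma_{12}$ and $\gamma_{13}$.

We define

$$ \tilde\Pi u := \begin{cases} \Pi u & \text{on } K_1, \\ V + \Pi u & \text{on } K_2 \cup K_3. \end{cases} $$

Then, on $\tilde K := K_1 \cup K_2 \cup K_3$,

$$ \|\nabla(u - \tilde\Pi u)\|_{0,\tilde K} \le \|\nabla V\|_{0, K_2 \cup K_3} + \sum_{i=1}^{3} \|\nabla(u - \Pi u)\|_{0,K_i}. $$

Using (5.14), (5.15) we get

$$ \|\nabla(u - \tilde\Pi u)\|_{0,\tilde K} \le C \sum_{i=1}^{3} \|\nabla(u - \Pi u)\|_{0,K_i} \qquad (5.16) $$

where $C > 0$ is independent of $p$.

If the $K_i$ are not of unit size, we may scale the estimate (5.13) without incurring $h$-powers. Since

$$ \tilde\Pi u|_{\partial \tilde K} = \Pi u|_{\partial \tilde K}, $$

further liftings in the presence of additional hanging nodes on $\partial \tilde K$ can be performed in the adjacent element patches, resulting in the error bounds (5.12), (5.13) with a larger $C$. □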


5.2  Corner  singularities 

Corner singularities are present in polygons and polyhedra whenever the governing equations contain second order, viscous terms, but also appear in certain inviscid problems (see, e.g., Figure 12 in the article [19]). Recent references are [33], [42], where further references can be found. We address the hp-FE approximation of corner singularities: although these singularities have very low regularity at the corner, exponential convergence results are nevertheless possible. To present ideas in the simplest setting, we start in dimension one (where corner singularities do not arise in practice), continue with the 2-d case and comment finally on the 3-d case, where two types of singularities, edge and vertex singularities, must be distinguished.


One dimensional case. In $I = (0, 1)$ a typical corner singularity function is given by

$$ s(x) = g(x)\, r^{\lambda} \qquad (5.17) $$

where $r(x) = |x|$ and $g(x)$ is analytic in $[0, 1]$. The singularity exponent $\lambda$ is not an integer and it must hold that

$$ \lambda > 1/2 $$

to ensure that $s(x)$ has finite energy, i.e. that $\|s\|_{1,I} < \infty$. Typically, $\lambda$ is small. For example, for $\lambda < 3/2$ the singular function $s(x) \notin H^2(I)$ and finite difference/finite volume methods on uniform meshes cannot even achieve first order convergence in $H^1(I)$. Likewise, spectral methods which approximate $s(x)$ on $I$ by increasing the polynomial degree $k$ will produce low algebraic convergence rates such as (see, e.g. [58])

$$ \|s - s_k\|_{\ell,I} \le C k^{-(2(\lambda - \ell) + 1)}, \qquad k = 1, 2, \dots, \quad \ell = 0, 1. \qquad (5.18) $$

Nevertheless, $s(x)$ is analytic on the set $(0, 1]$, so the low rate (5.18) is caused solely by the point singularity at $x = 0$. hp-FEM exploit this piecewise analyticity as follows.

Consider the sequence of geometric meshes $\mathcal{T}^{n,\sigma}$ with $n$ layers and geometric grading factor $\sigma$, $0 < \sigma < 1$, and polynomial degrees $k_K$ shown in Figure 5.3. Note that here the grading factor $\sigma = 0.5$ and that the number $n$ of refinements is proportional to the maximal polynomial degree $k_{\max} = \max\{k_K : K \in \mathcal{T}^{n,\sigma}\}$ in the mesh: As $n$ increases, mesh and polynomial orders change simultaneously. We have the following hp-approximation result:
Fig. 5.3. Geometric mesh and polynomial degree distribution $\mathbf{k}$ for root singularity at $x = 0$ (discontinuous polynomials)


Theorem 5.8. Consider the root singularity $s(x)$ in (5.17) defined in $I = (0, 1)$. Let $\mathcal{T}^{n,\sigma} = \{K^n_j : j = 1, \dots, n\}$, $K^n_1 = (0, \sigma^{n-1})$, $K^n_j = (\sigma^{n-j+1}, \sigma^{n-j})$, $j = 2, \dots, n$, be the geometric mesh with $n$ layers and grading factor $0 < \sigma < 1$ as in Figure 4.1. Let the degree distribution $\mathbf{k}(n) = \{k^n_j : j = 1, \dots, n\}$ satisfy $k^n_j \ge \mu (j - 1)$ for some $\mu = \mu(\sigma) > 0$ sufficiently large. Then for every $n$ there exists a (possibly discontinuous) piecewise polynomial

$$ s_n(x) \in S_n := S^{\mathbf{k}(n),0}(I, \mathcal{T}^{n,\sigma}) $$

satisfying the error bound

$$ \|s - s_n\|_{L^2(I)} \le C e^{-bn} \le C e^{-b\sqrt{N}} \qquad (5.19) $$

where $N = \dim(S_n) = O(n^2)$ and $C$, $b$ are independent of $n$ (but depend on $\sigma$ and $\lambda$).

If $k^n_j \ge \mu j$, (5.19) still holds with an $s_n(x) \in S^{\mathbf{k}(n),1}(I, \mathcal{T}^{n,\sigma})$.

If the polynomial degree is uniform, i.e. $k^n_j = k = n$ for all $j$, then (5.19) still holds, with possibly different constants $b$ and $C$.

For a proof, we refer for example to [58].

Theorem 5.8 shows that by judicious combination of mesh $\mathcal{T}$ and degree vector $\mathbf{k}$, exponential convergence can be achieved. Mesh refinement or order increase alone yield only algebraic convergence rates. Similar results hold also when the pointwise error is of interest.
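The interplay of geometric mesh and linearly increasing degrees in Theorem 5.8 can be observed in a small experiment. The sketch below takes $s(x) = x^{\lambda}$ with $g \equiv 1$, $\lambda = 3/4$, $\sigma = 1/2$ and degrees $k_j = j - 1$ (i.e. $\mu = 1$; these parameter values are illustrative choices, not taken from the text), computes the elementwise $L^2$ projection by Gauss quadrature and records the error against the number of layers $n$. Quadrature of the singular integrand on the first element limits the accuracy somewhat, so the numbers are indicative only.

```python
import numpy as np
from numpy.polynomial import legendre as L

def geometric_mesh(n, sigma):
    """Nodes 0, sigma^(n-1), ..., sigma, 1 of the n-layer geometric mesh on (0, 1)."""
    return np.concatenate(([0.0], sigma ** np.arange(n - 1, -1, -1.0)))

def hp_l2_error(n, sigma=0.5, lam=0.75, mu=1.0, nquad=60):
    """L2(0,1) error and number of degrees of freedom of the discontinuous
    elementwise L2 projection of x**lam with degrees k_j = ceil(mu*(j-1))."""
    nodes = geometric_mesh(n, sigma)
    t, w = L.leggauss(nquad)                   # Gauss rule on (-1, 1)
    err2, dofs = 0.0, 0
    for j in range(1, n + 1):
        a, b = nodes[j - 1], nodes[j]
        k = int(np.ceil(mu * (j - 1)))
        x = 0.5 * (b - a) * t + 0.5 * (a + b)  # quadrature points in element j
        f = x ** lam
        # Legendre coefficients of the local projection on the reference interval
        c = np.array([(2 * i + 1) / 2.0 * np.sum(w * f * L.Legendre.basis(i)(t))
                      for i in range(k + 1)])
        err2 += 0.5 * (b - a) * np.sum(w * (f - L.legval(t, c)) ** 2)
        dofs += k + 1
    return np.sqrt(err2), dofs

errors = [hp_l2_error(n) for n in range(1, 9)]
for n, (err, dofs) in enumerate(errors, start=1):
    print(n, dofs, err)
```

In this setting the error shrinks by a roughly constant factor per added layer while the number of degrees of freedom grows only like $n^2$, which is the behaviour expressed by (5.19).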


Two  dimensional  case.  Consider  a  polygon  Q  C  IR2  as  shown  in  Figure 
3.1.  A  corner  singular  function  S(rj,ipj)  at  vertex  Pj  is  as  in  (3.2).  To  simplify 
the  notation,  we  may  assume  that  Pj  —  O  and  that  r{x)  —  rj( x)  =  |as|.  Then 
there  holds  again  an  exponential  convergence  result. 

Theorem  5.9.  Let  0  be  a  polygonal  domain  containing  the  origin  O  as  a 
vertex  and  let  uSing  =  S(r,(p )  be  a  singular  function  as  in  (3.2).  Let  0  < 
o  <  1  and  {Tn,<7}n  be  a  sequence  of  geometric  meshes  refined  towards  O 
with  n  layers  (see,  e.g.  Figure  5.3)  and  grading  factor  a,  0  <  o  <  1.  Let  the 
polynomial  degree  k  be  uniform  and  proportional  to  the  number  of  layers,  i.e. 
k  ~  n.  Then,  for  every  n  exists  a  continuous,  piecewise  polynomial  function 
un(x)  €  Sk’1(f2,Tn,a)  such  that 

||S(r,  ip)  -  «n|U.(fl)  <  Ce~bn  =  Ce~bNl/3  (5.20) 

where  b,  C  >  0  are  independent  of  N  =  dim (Sk'1(0,Tn,<T)),  the  number  of 
degrees  of  freedom  of  Sk'1(Q,Tn'<T). 

For  a  proof,  we  refer  for  example  to  [34],  [58]. 

Remark 5.10. We emphasize that uniform polynomial degree $k$ is not necessary -
it suffices in fact to allow $k = 2$ in the element abutting at $O$, and to
let $k_K$ increase linearly with the number of elements $K' \in \mathcal{T}^{n,\sigma}$ between $K$
and $O$.

Remark 5.11. Theorems 5.8 and 5.9 give exponential convergence for any
$0 < \sigma < 1$. This raises the question of the optimal $\sigma$. In one dimension,
one can show that $\sigma_{\rm opt} = (\sqrt{2} - 1)^2 = 0.17\ldots$ is optimal regardless of the
strength of the singularity. In two dimensions, no analytical result is known,
but also here geometric meshes with grading $\sigma \approx \sigma_{\rm opt}$ outperform meshes
with other values of $\sigma$, see also Figure 10.11 below.
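
As a small illustration of Remark 5.11, the following Python sketch evaluates the one-dimensional value $(\sqrt{2}-1)^2$ and builds a geometric mesh on (0, 1) refined towards 0, together with a linearly increasing degree vector. The function names and the slope parameter `mu` are assumptions made for this example only; they do not come from the text.

```python
import math

def sigma_opt():
    """Optimal 1D grading factor quoted in Remark 5.11: (sqrt(2) - 1)^2."""
    return (math.sqrt(2.0) - 1.0) ** 2

def geometric_mesh(n, sigma):
    """Nodes 0 < sigma^n < ... < sigma < 1 of a geometric mesh with n layers on (0, 1)."""
    if not (0.0 < sigma < 1.0):
        raise ValueError("grading factor must satisfy 0 < sigma < 1")
    return [0.0] + [sigma ** j for j in range(n, -1, -1)]

def linear_degree_vector(n, mu=1.0):
    """Degrees k_j = max(1, floor(mu * j)) on element j = 1, ..., n + 1.

    The slope mu is a hypothetical tuning parameter, not a value from the text.
    """
    return [max(1, int(math.floor(mu * j))) for j in range(1, n + 2)]

nodes = geometric_mesh(4, sigma_opt())   # 6 nodes, i.e. 5 elements
degrees = linear_degree_vector(4)        # one degree per element
```

Element j is the interval between consecutive nodes, counted from the singular point, so the smallest elements carry the lowest degrees.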

If $\Omega$ contains more than one vertex as e.g. in Figure 3.1, (5.20) still holds
if at each vertex (reentrant or not) a geometric mesh patch $\mathcal{T}^{n,\sigma}$ is used.

Remark 5.12. For (5.20) to hold, it is not necessary that the domain $\Omega$ is
a straight sided polygon. The same result holds also for curved domains,
see [2].

Remark 5.13. In three dimensions, at vertices the construction is analogous,
whereas at edges of polyhedra $\Omega \subset \mathbb{R}^3$ the geometric mesh refinement is
anisotropic towards the edge. The resulting geometric meshes contain in the
vicinity of edges the so-called "needle elements" of aspect ratio $1 : \sigma^p$ - this is
necessary to achieve exponential convergence in three dimensional polyhedra.
Geometric refinement towards the edge with $\kappa$-uniform meshes will not give
exponential convergence rates (see [3] and Remark 5.18 for more).


5.3  hp-Boundary  layer  resolution 

Analogous to corner singularities, hp-FEM can deal very effectively with
boundary layers and viscous shock profiles as introduced in Sections 3.2 and
3.3. Here, we collect the main mesh design principles and convergence results
for the hp-FEM for these problems (see also the references [59], [61], [45] for
proofs and further details).

Boundary layers are, like corner singularities, essentially one-dimensional
phenomena; therefore, we first address the hp-FEM for boundary layers in
one dimension.


One dimensional results. On the interval $I = (0,1)$, consider a boundary
layer function with length-scale $d > 0$ satisfying the estimates (3.6). A typical
example is the (ubiquitous) exponential boundary layer $u_d^{bl}(x) = \exp(-x/d)$.
For the hp-FEM, we have the following result [59].

Theorem 5.14. In $I = (0,1)$, consider the exponential boundary layer function

$$u_d^{bl}(x) = \exp(-x/d).$$

For $0 < \kappa < 4/e$, $0 < d \le 1$, and $k = 1, 2, \ldots$ let $\mathcal{T}^k_\kappa$ be a sequence of meshes
defined by

$$\mathcal{T}^k_\kappa = \{(0,\kappa k d),\ (\kappa k d, 1)\} \quad \text{if } \kappa k d < 1,$$
$$\mathcal{T}^k_\kappa = \{(0,1)\} \quad \text{if } \kappa k d \ge 1.$$  (5.21)

Let the polynomial degree be uniform and equal to $k$; then for every $k \in \mathbb{N}$
there exists $u_k^d \in S^{k,1}(I,\mathcal{T}^k_\kappa)$ such that the following error estimates hold:

$$\|u_d^{bl} - u_k^d\|_{L^2(I)} \le C d^{1/2} \exp(-bk),$$
$$\|u_d^{bl} - u_k^d\|_{H^1(I)} \le C d^{-1/2} \exp(-bk),$$
$$\|u_d^{bl} - u_k^d\|_{L^\infty(I)} \le C \exp(-bk).$$

Here $b, C > 0$ are constants which are independent of $d, k$, but depend on $\kappa$.
We see that in the presence of boundary layers, two elements are sufficient
for robust exponential resolution of boundary layers in the context of the
hp-FEM. Note, however, that the size of the smaller element is crucial - it
must be proportional to $kd$; the precise value need not be achieved and the
constant $C$ does not depend sensitively on $\kappa$, as the results in Figure 5.4
show. In Figures 5.5-5.7, we see the comparison of various finite element
methods in terms of the error vs. the number of degrees of freedom. Low
order methods with uniform meshes as well as spectral methods on a fixed
mesh are clearly inferior to low order methods on judiciously refined meshes
which in turn are inferior to the hp-FEM, especially at very small values
of $d$.
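
The two-element mesh of Theorem 5.14 is easy to experiment with. The sketch below builds the mesh and measures, on a sample grid, the maximum error of a degree-$k$ interpolant of $\exp(-x/d)$ on each element. Interpolation at Chebyshev-Lobatto points is an assumption made here for simplicity; it is not the approximant constructed in [59], and the function names are hypothetical. It only illustrates the qualitative behaviour (rapid decay of the error in $k$ at fixed small $d$).

```python
import math

def boundary_layer_mesh(k, d, kappa=1.0):
    """Two-element mesh {(0, kappa*k*d), (kappa*k*d, 1)}, or {(0, 1)} if kappa*k*d >= 1."""
    h = kappa * k * d
    return [(0.0, h), (h, 1.0)] if h < 1.0 else [(0.0, 1.0)]

def interpolation_error(f, a, b, k, samples=400):
    """Max sampled error of the degree-k Chebyshev-Lobatto interpolant of f on [a, b]."""
    nodes = [0.5 * (a + b) + 0.5 * (b - a) * math.cos(j * math.pi / k)
             for j in range(k + 1)]
    values = [f(x) for x in nodes]
    # Barycentric weights for Chebyshev-Lobatto points (common scaling cancels).
    weights = [(-1.0) ** j * (0.5 if j in (0, k) else 1.0) for j in range(k + 1)]
    worst = 0.0
    for i in range(samples + 1):
        x = a + (b - a) * i / samples
        num = den = 0.0
        exact_hit = None
        for xj, fj, wj in zip(nodes, values, weights):
            if x == xj:              # sample coincides with a node
                exact_hit = fj
                break
            c = wj / (x - xj)
            num += c * fj
            den += c
        p = exact_hit if exact_hit is not None else num / den
        worst = max(worst, abs(f(x) - p))
    return worst

def hp_layer_error(k, d, kappa=1.0):
    """Max sampled error over all elements of the boundary layer mesh."""
    f = lambda x: math.exp(-x / d)
    return max(interpolation_error(f, a, b, k)
               for a, b in boundary_layer_mesh(k, d, kappa))

errors = {k: hp_layer_error(k, 1e-3) for k in (4, 8, 16)}
```

Since both element endpoints are interpolation nodes, the piecewise interpolant is continuous across the interface at $x = \kappa k d$.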


We see here the comparison in approximability in the "Energy" norm

$$\|u\|_E := d\,|u|_{H^1(I)} + \|u\|_{L^2(I)}$$

for various methods - here $I = (-1,1)$ and $u_d$ is as in (3.6). We clearly
see the superiority of the hp-FEM over all other approaches. In particular,
for small values of $d$ the only way to get high accuracy in the layer at a
reasonable number of DOF is the hp-FEM.


[Figures 5.5-5.7: relative error in the energy norm vs. number of degrees of freedom.]
Fig. 5.6. Comparison of various methods, $d = 10^{-3}$.
Fig. 5.7. Comparison of various methods, $d = 10^{-6}$.


Let us comment further on Theorem 5.14. Two items seem to limit the
generality of the result: the explicit form of $u_d^{bl}$ and the specific knowledge
of the parameter $d$ for the mesh design. In fact, both prerequisites can be
relaxed. We have the following result [45], [59].

Theorem 5.15. Let the boundary layer function $u_d^{bl}$ on $I = (0,1)$ be as in
(3.6). Define for $k = 1,2,3,\ldots$ the mesh $\mathcal{T}^k_\kappa$ as in (5.21) above. Then there
exists $u_k^d \in S^{k,1}(I,\mathcal{T}^k_\kappa)$ on $I$ such that, for $0 < \kappa < \kappa_0$,

$$\|u_d^{bl} - u_k^d\|_{L^\infty(I)} + \kappa k d\,\|(u_d^{bl} - u_k^d)'\|_{L^\infty(I)} \le C e^{-b\kappa k}$$

where $b, C > 0$ are independent of $d$, $\kappa$ and $k$.

Moreover,

$$u_k^d(0) = u_d^{bl}(0), \qquad u_k^d(1) = u_d^{bl}(1).$$

The meshes of Theorems 5.14 and 5.15 require the length scale $d$ of the
boundary layer to be known explicitly; if several length scales
$d_1, d_2, \ldots, d_l$ are present, all of them must be known in order to construct
the hp-boundary layer meshes. Moreover, the FE-subspaces are not hierarchic,
since the mesh changes at every increase of $k$. Both drawbacks are overcome
once more by means of geometric meshes.

Theorem 5.16. On $I = (0,1)$ consider a boundary layer $u_d^{bl}$ of length scale
$d$ as in (3.6). Let $n \in \mathbb{N}$ be fixed and consider in $I$ the geometric mesh $\mathcal{T}^{n,\sigma}$
with $n$ layers and grading factor $\sigma$. Assume scale resolution, i.e. that
condition (5.22) holds
for some $c > 0$. Then there are $C, \tau > 0$ such that for every $k \in \mathbb{N}$ there is
$u_k^d \in S^{k,1}(I,\mathcal{T}^{n,\sigma})$ with

$$\|u_d^{bl} - u_k^d\|_{L^\infty(I)} + d\,\|(u_d^{bl} - u_k^d)'\|_{L^\infty(I)} \le C e^{-\tau k}.$$  (5.23)

For  a  proof,  see  [45]. 

As compared to Theorem 5.15, we have an additional condition (5.22);
in the context of hp-FEM, scale resolution is not a very severe condition,
since geometric mesh refinement allows one to resolve extremely small scales
with few layers. For example, let $d = 10^{-10}$ and $\sigma = 0.1$. Then $L = 10$
layers will suffice in (5.22). More generally, we get scale resolution provided
that $L = O(\log_\sigma(d))$, a weak requirement if compared to the uniform mesh
refinement necessary for low order elements; even adaptive low order elements
will require considerably more DOF (in terms of small $d$) to resolve the small
scales.
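
The layer count in the example above can be checked with a few lines of Python. The precise form of the scale-resolution condition is not reproduced here, so the sketch assumes it reads $\sigma^L \le c\,d$; this form, the function name and the default $c = 1$ are assumptions. Exact rational arithmetic is used because the floating point value of `0.1 ** 10` is slightly larger than `1e-10` and would give one layer too many.

```python
from fractions import Fraction

def layers_needed(d, sigma, c=1):
    """Smallest L with sigma**L <= c*d (assumed form of the scale-resolution condition)."""
    if not (0 < sigma < 1) or d <= 0 or c <= 0:
        raise ValueError("need 0 < sigma < 1, d > 0, c > 0")
    L, width = 0, Fraction(1)
    while width > c * d:
        width *= sigma      # size of the smallest element after one more layer
        L += 1
    return L

# Example from the text: d = 10^-10, sigma = 0.1.
L = layers_needed(Fraction(1, 10**10), Fraction(1, 10))
```

The loop count grows like $\log d / \log \sigma$, which is the logarithmic dependence on $d$ stated in the text.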


Two dimensional results (smooth domain). The previous results on
one-dimensional hp-boundary layer resolution apply immediately to boundary
layers of the form (5.5). The main idea is now to use a tensor product
mesh with anisotropic elements that are aligned with the layer. Figure 5.8
shows this construction in detail.


Fig. 5.8. Boundary-fitted elements in $\Omega_0$.

If we now look at the components of (3.7), we see that the boundary
layer effect is still only a one-dimensional one, in the direction of $\rho$ (the
functions $\tau_j(\theta)$ being smooth). Hence we may define boundary-fitted elements
(as shown in Figure 5.8) on $\Omega_0$. We do this by dividing $\partial\Omega$ into subintervals
$(\theta_i, \theta_{i+1})$, $1 \le i \le m-1$, $\theta_i \in \partial\Omega$, and drawing the inward normal at
$\theta_i$, $1 \le i \le m$, of length $\rho_0$. Then the points $(\rho_0,\theta_i)$ are connected by the
curve $\rho = \rho_0$.
Each curvilinear quadrilateral $S = ABEF$ is then further subdivided into
two elements $S_1$ and $S_2$ by the curve $\rho = \kappa k d$, according to the prescription
in the previous section. Looking at $ABEF$ in the $(\rho,\theta)$ coordinates then gives
two rectangular elements $S_1 = A'B'C'D'$ and $S_2 = D'C'E'F'$ as shown in
Figure 5.8. The local polynomial space on $S_i$, $i = 1,2$ is then defined (using
the notation $v(x,y) = v(\rho,\theta)$ for $(x,y) = (x(\rho,\theta),y(\rho,\theta))$) by

$$Q_k(S_i) = \{v(x,y) : v(\rho,\theta) \in Q_k(S_i)\}.$$


Note that the basis functions we use are polynomials in $(\rho,\theta)$ instead of in
$(x,y)$.

Consider the local approximation of (3.7) over the space

$$V_k(S) = \{v \in C^0(S) : v|_{S_i} \in Q_k(S_i)\}.$$


The function $\tau_i$, being smooth, is approximated exponentially by a piecewise
polynomial $\tau_i^k(\theta)$ of degree $k$. The boundary layer function $\exp(-\alpha\rho/d)$ is
approximated at an exponential rate by a piecewise polynomial $v(\rho)$ of
degree $k - q$, as in Theorem 5.7. Then, for $q$ fixed, $k$ large enough, we obtain
by a simple tensor product argument

$$\Bigl\| Z(\rho,\theta) - \sum_{i=0}^{q} \tau_i^k(\theta)\,\rho^i\,v(\rho) \Bigr\|_{E,d,S} \le C d^{1/2} \exp(-bk)$$  (5.24)

so that the local approximation in the energy norm is the same as that in
the one-dimensional case.


Remark 5.17. So far, we considered only boundary-fitted meshes. Analogous
results are also valid for more general, properly refined triangulations at the
boundary [46].

Similar arguments apply also for the other results, Theorems 5.15 and
5.16, if they are combined with high order polynomial approximation on large
elements along the layer/front.

Remark 5.18. (on anisotropic refinement) We emphasize here that anisotropic
mesh refinement is a conditio sine qua non for the robust exponential
convergence of hp-FEM in the presence of boundary layers and edge singularities
in polyhedra; isotropic refinement will not suffice, since e.g. in shape regular
geometric meshes the number of elements (and therefore the number of
degrees of freedom) will increase exponentially (see Figure 5.9).


Boundary layer-corner singularity interaction. The above remarks apply
only to a smooth boundary or near a smooth boundary segment. Consider now
flow problems with small viscosity, or the reaction-diffusion equation (2.7)
with small diffusion constant $\varepsilon$, in a polygon. Here boundary layers appear
near smooth boundary segments, corner singularities near vertices and corner
layers in the transition region between corner and boundary segment. All
these effects are resolved by hp-FEM based on the $(\kappa,\sigma)$-meshes in Section 4
(see in particular Figures 8-9 in Chapter 4). We conjecture that the solution
of (2.7) in polygonal domains for $0 < \varepsilon \le 1$ can be approximated robustly at
an exponential rate. This is corroborated also by numerical experiments.

Fig. 5.9. Geometric isotropic and anisotropic refinement towards curve C

In the convection-diffusion problem (2.11) the additional difficulty arises
that the dominant transport terms propagate the effect of corner singularities
into the domain along characteristics. For $\varepsilon > 0$, the typical corner
singularities arise at vertices and generate so-called characteristic boundary
layers along characteristics. For piecewise analytic data, the singular support
of the solution contains characteristic lines, which changes the length-scale of
the layers associated with these lines. Schematically, this is shown in Figure
5.10. At the outflow boundary

$$\Gamma_+ = \{x \in \Gamma : \beta \cdot n(x) > 0\}$$

we have an outflow boundary layer of width $O(\varepsilon)$, whereas along the
characteristic sets

$$C := \{x \in \Omega \mid \dot{x}(s) = \beta(x(s)),\ x(0) = P_i,\ i = 1,\ldots,M\},$$

i.e. the union of integral curves (contained in $\Omega$) of the advection field $\beta(x)$
through the vertices, so-called parabolic layers of width $O(\sqrt{\varepsilon})$ arise. Notice that
the corner singularities at inflow vertices $P_i \in \Gamma_- := \Gamma \setminus \Gamma_+$ (we assume here
that $\Gamma$ does not contain characteristic segments) influence these layers; their
precise regularity is, even in the linear, 2-d case, still under investigation.

In Figure 5.10, the lines in $C$ are straight since the field $\beta$ is constant. In
general, these lines are curved for variable $\beta = \beta(x)$ and the hp-mesh design
must be anisotropic and geometric towards $C$ in order to achieve exponential
convergence (see Remark 5.18).

Similar remarks apply also to the viscous shock profiles introduced in
Section 3.3.

Fig. 5.10. Convection-Diffusion problem (2.11) in a polygon $\Omega$: length scales of the
layers


Part  II 

hp  discretization  techniques 

The convergence of any numerical method is based upon consistency of the
approximation and upon stability of the discretization. We have seen in the
first part that hp-FEM can achieve exponential approximation rates for typical
viscous flow features; this requires the combination and simultaneous variation
of polynomial order and strong, possibly anisotropic, mesh refinement.
This raises the question of how to stably discretize CFD problems with
hp-approximations. This part of the notes deals with the most important
discretization techniques for such problems. All discretizations are based on
some form of Galerkin projection upon the hp-subspaces. This methodology
is well-established in solid mechanics where stable variational principles for
most problems are readily available. In CFD we have to deal in particular with
strongly advection dominated problems for which the usual Galerkin type
discretizations do not exhibit good stability properties. To ensure robustness, we
must therefore resort to non-standard - from the point of view of symmetry -
discretization techniques for the viscous terms such as finite-volume,
discontinuous Galerkin methods or in particular the stabilized Galerkin schemes, i.e.
streamline-diffusion FEM (SDFEM) and the Galerkin-Least Squares (GLS)
techniques. The presentation of these techniques is the purpose of the second
part of the notes.
6 Reaction-Diffusion

We consider the discretization of the problem (2.7), (2.8). Several FE
discretizations are presented, each based on a specific variational formulation of
(2.7), (2.8). Of particular interest are discontinuous approximations which
can be used with the discontinuous Galerkin technique for first order
problems, see Sections 7 and 12 ahead.

6.1  Standard  (continuous)  Discretization 

We assume first that the Dirichlet data $f$ in (2.8) is zero and introduce the
space

$$H^1_D(\Omega) := \{u \in H^1(\Omega) : u = 0 \text{ on } \Gamma_D\}.$$  (6.1)

The variational formulation of (2.7), (2.8) with general $A(x)$ is

$$u \in H^1_D(\Omega) : \quad a(u,v) = \ell(v) \quad \forall v \in H^1_D(\Omega)$$  (6.2)

where the forms are defined by

$$a(u,v) := \varepsilon(\nabla v, A\nabla u)_\Omega + (v,u)_\Omega,$$
$$\ell(v) := (S,v)_\Omega + (g,v)_{\Gamma_N}.$$

The form $a(\cdot,\cdot)$ is symmetric and coercive, i.e.

$$a(u,u) \ge \|u\|_E^2 := \varepsilon\|\nabla u\|_{0,\Omega}^2 + \|u\|_{0,\Omega}^2 > 0$$  (6.3)

if $u \ne 0$, due to (2.4).

The discretization of (6.2) is obtained by restricting $u$ and $v$ to FE
subspaces of $H^1_D(\Omega)$; in order to achieve it, subspaces of continuous, piecewise
polynomial functions must be chosen. We have

$$u_{FE} \in S_D^{k,1}(\Omega,\mathcal{T},F_{\mathcal{P}}) : \quad a(u_{FE},v) = \ell(v) \quad \forall v \in S_D^{k,1}(\Omega,\mathcal{T},F_{\mathcal{P}}).$$  (6.4)

Here

$$S_D^{k,1} := S^{k,1} \cap H^1_D.$$

Let $N = \dim(S_D^{k,1})$ and $\{\varphi_i : i = 1,\ldots,N\}$ be a basis for $S_D^{k,1}$. Then (6.4) is
equivalent to the linear system

$$A x = \ell$$

where the entries of the diffusion-stiffness matrix are given by

$$A_{ij} = a(\varphi_i,\varphi_j), \quad 1 \le i,j \le N,$$

and the entries of the load vector $\ell$ are $\ell_j = \ell(\varphi_j)$.

The diffusion matrix is symmetric and positive definite and must be
evaluated by numerical quadrature of sufficiently high order, in particular if the
elements are curved, see [41], [65], [66], [67] and [46] for quadrature techniques
and error estimates.
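
To make the passage from the variational problem to a linear system concrete, here is a minimal sketch for the simplest possible setting: one space dimension, $\Omega = (0,1)$, $A = 1$, $S = 1$, homogeneous Dirichlet data at both ends and piecewise linear elements ($k = 1$) on a uniform mesh. All of these choices, and the function names, are assumptions made for illustration; the element integrals are evaluated exactly here, so no quadrature is needed.

```python
import math

def solve_reaction_diffusion_p1(eps, n):
    """P1 Galerkin solution of -eps*u'' + u = 1 on (0,1), u(0) = u(1) = 0, n uniform elements."""
    h = 1.0 / n
    m = n - 1                                   # number of interior nodes
    diag = [2.0 * eps / h + 4.0 * h / 6.0] * m  # A_ii = a(phi_i, phi_i)
    off = [-eps / h + h / 6.0] * (m - 1)        # A_{i,i+1} = a(phi_i, phi_{i+1})
    rhs = [h] * m                               # l(phi_i) = (1, phi_i)
    # Thomas algorithm for the symmetric tridiagonal system A x = l.
    for i in range(1, m):
        w = off[i - 1] / diag[i - 1]
        diag[i] -= w * off[i - 1]
        rhs[i] -= w * rhs[i - 1]
    x = [0.0] * m
    x[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):
        x[i] = (rhs[i] - off[i] * x[i + 1]) / diag[i]
    return [0.0] + x + [0.0]                    # add the Dirichlet boundary values

def exact_solution(eps, x):
    """Exact solution of the model problem above."""
    s = math.sqrt(eps)
    return 1.0 - math.cosh((x - 0.5) / s) / math.cosh(0.5 / s)

n = 64
u_h = solve_reaction_diffusion_p1(1.0, n)
nodal_error = max(abs(u_h[i] - exact_solution(1.0, i / n)) for i in range(n + 1))
```

For small $\varepsilon$ the exact solution develops boundary layers at both ends, and the uniform low order mesh used here is exactly the setting that the boundary layer meshes of Section 5.3 are designed to improve upon.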
6.2 Mixed discretization

The continuity of the FE solution $u_{FE}$ in (6.4) is restrictive - in connection
with finite volume methods for convection dominated problems or
discontinuous Galerkin methods it is desirable to admit discontinuous approximations
for $u_{FE}$. To this end, the variational formulation (6.2) must be changed. We
write

$$-\operatorname{div} q(\nabla u) + \sigma u = S \quad \text{in } \Omega,$$  (6.5)

where the flux $q$ is given by

$$q(\nabla u) = A\nabla u \quad \text{in } \Omega.$$  (6.6)

We get the (dual) mixed variational formulation:

find $u \in L^2(\Omega)$ and $q \in H(\operatorname{div},\Omega)$ such that $n \cdot q = g$ on $\Gamma_N$ and

$$(v,\sigma u)_\Omega - (v,\operatorname{div} q)_\Omega = (S,v) \quad \forall v \in L^2(\Omega),$$
$$(u,\nabla\cdot p)_\Omega + (p,q)_\Omega = (f, n\cdot p)_{\Gamma_D} \quad \forall p \in H(\operatorname{div},\Omega).$$  (6.7)


Here $H(\operatorname{div},\Omega)$ is defined as follows:

$$H(\operatorname{div},\Omega) := \{q \in L^2(\Omega)^d : \operatorname{div} q \in L^2(\Omega)\},$$  (6.8)

where the divergence is understood in the weak sense.

The mixed FE discretization of (6.7) is based on subspaces
$S^{k,0}(\Omega,\mathcal{T},F_{\mathcal{P}}) \subset L^2(\Omega)$ and $S^k_{\rm div}(\Omega,\mathcal{T},F_{\mathcal{P}}) \subset H(\operatorname{div},\Omega)$; now $u_{FE}$ can be
discontinuous, but $d$ components of the flux $q$ must be discretized; the finite element
fluxes $q_{FE}$ must have a continuous normal component across element
interfaces, but their tangential component(s) may be discontinuous.

The linear system corresponding to (6.7) has the form (for constant $\sigma$)


where $M$ is the $L^2$-mass matrix of $u$, $C$ is the mass-matrix of $q$ and $B$,
$B^T$ correspond to the nonsymmetric forms of (6.7). In conjunction with
an explicit time-stepping strategy, the spectrum of the matrix in (6.9) is of
interest. We have

$$(u^T, q^T)\begin{pmatrix} \sigma M & -B \\ B^T & C \end{pmatrix}\begin{pmatrix} u \\ q \end{pmatrix}
= \sigma u^T M u - u^T B q + q^T B^T u + q^T C q
= \sigma u^T M u + q^T C q > 0$$  (6.10)

if $\sigma > 0$, $u \ne 0$, $q \ne 0$, i.e. for $\sigma > 0$ the matrix has eigenvalues with positive
real part, so that this discretization is dissipative.
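
The cancellation of the two off-diagonal contributions in (6.10) can be checked numerically. The sketch below assembles the block matrix from small dense matrices and compares the quadratic form with $\sigma u^T M u + q^T C q$. The matrices and vectors are made-up numbers chosen only for illustration, and the helper names are hypothetical.

```python
def mixed_block_matrix(sigma, M, B, C):
    """Assemble [[sigma*M, -B], [B^T, C]] from dense lists of lists."""
    nu, nq = len(M), len(C)
    K = [[0.0] * (nu + nq) for _ in range(nu + nq)]
    for i in range(nu):
        for j in range(nu):
            K[i][j] = sigma * M[i][j]
        for j in range(nq):
            K[i][nu + j] = -B[i][j]     # upper right block: -B
            K[nu + j][i] = B[i][j]      # lower left block:  B^T
    for i in range(nq):
        for j in range(nq):
            K[nu + i][nu + j] = C[i][j]
    return K

def quadratic_form(K, x):
    """Evaluate x^T K x."""
    n = len(x)
    return sum(x[i] * K[i][j] * x[j] for i in range(n) for j in range(n))

sigma = 0.5
M = [[2.0, 1.0], [1.0, 2.0]]                               # SPD "mass matrix" of u
C = [[3.0, 0.5, 0.0], [0.5, 3.0, 0.5], [0.0, 0.5, 3.0]]    # SPD "mass matrix" of q
B = [[1.0, -2.0, 0.5], [0.0, 4.0, -1.0]]                   # arbitrary coupling block
u, q = [1.0, -2.0], [0.5, 1.0, -1.0]

lhs = quadratic_form(mixed_block_matrix(sigma, M, B, C), u + q)
rhs = sigma * quadratic_form(M, u) + quadratic_form(C, q)
```

The block $B$ drops out of the quadratic form entirely, which is why positivity depends only on $\sigma M$ and $C$.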

If $\sigma = 0$, stability of (6.9) is not guaranteed in general. In this case,
we must require a compatibility condition of the spaces $S^{k,0}$ and $S^k_{\rm div}$, the
so-called discrete inf-sup condition:

$$\forall\, 0 \ne u \in S^{k,0} : \quad \sup_{0 \ne q \in S^k_{\rm div}} \frac{(u,\operatorname{div} q)}{\|\operatorname{div} q\|_{0,\Omega}} \ge \gamma\,\|u\|_{0,\Omega}$$  (6.11)

for some $\gamma > 0$.

An example of an element family satisfying (6.11) is the so-called discrete
Raviart-Thomas family (see [16], Chapter III for more).


6.3  Mortar-Discretization 

The mixed discretization has the disadvantage that for each component $u$ of
the flow field $d$ additional fluxes must be discretized, leading to a large number
of unknowns. Another approach is to use discontinuous $u$ and to penalize
the interelement jumps by Lagrange multipliers on the element interfaces,
leading to the so-called Mortar Element Method (MEM). Some relevant
references are [11], [8] and, for the hp-MEM, [63].

We describe the MEM for the model problem

$$-\operatorname{div} q(\nabla u) + \sigma u = S \quad \text{in } \Omega,$$  (6.12)

$$u = 0 \quad \text{on } \Gamma_D, \qquad n \cdot q(\nabla u) = g \quad \text{on } \Gamma_N.$$  (6.13)

Here $\sigma \ge 0$ and the flux $q(\nabla u)$ is as in (6.6). Let $\mathcal{T}$ be a mesh in $\Omega$ built
out of regular patch meshes $\mathcal{T}_{\mathcal{P}}$, which are possibly irregular across patch
interfaces for $K, K' \in \mathcal{T}$ with intersection $F_{KK'}$ of positive $d-1$ dimensional
measure. In the MEM, we use the standard variational formulation (6.2) with
discontinuous $u, v \in S_D^{k,0}(\Omega,\mathcal{T},F_{\mathcal{P}})$.

The bilinear form $a(\cdot,\cdot)$ must be reinterpreted then, since the $H^1(\Omega)$ norm
is not defined for $u, v \in S_D^{k,0}$.

Broken Bilinear Forms and Spaces. We reformulate therefore (6.2) for
piecewise $H^1$-functions on the partition $\mathcal{T}$ and set

$$H^1(\Omega,\mathcal{T}) := \{u \in L^2(\Omega) : u|_K \in H^1(K) \ \forall K \in \mathcal{T}\},$$  (6.14)

equipped with the broken seminorm and norm

$$|u|^2_{1,\Omega,\mathcal{T}} := \sum_{K \in \mathcal{T}} \|\nabla u\|^2_{0,K}, \qquad \|u\|^2_{1,\Omega,\mathcal{T}} := \sum_{K \in \mathcal{T}} \|u\|^2_{1,K}.$$  (6.15)

To generalize (6.2) to $u, v \in S_D^{k,0}$, we also introduce the broken bilinear form:
for $u, v \in H^1(\Omega,\mathcal{T})$, set

$$a_{\mathcal{T}}(u,v) := \sum_{K \in \mathcal{T}} \bigl[\varepsilon(\nabla v, A\nabla u)_{0,K} + (v,u)_{0,K}\bigr] = \sum_{K \in \mathcal{T}} a_K(u,v).$$  (6.16)

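Variational formulation in broken spaces. We derive an analog to (6.2)
in broken spaces. Let $K \in \mathcal{T}$ be any element. Multiplying the equation

$$-\operatorname{div} q(\nabla u) + \sigma u = S \quad \text{in } K$$

by $v \in H^1(K)$ and integrating by parts on $K$ gives

$$\varepsilon(\nabla v, A\nabla u)_{0,K} + \sigma(u,v)_{0,K} = (S,v)_{0,K} + (v, n_K \cdot q_K)_{0,\partial K}.$$  (6.17)

To get a variational formulation of (6.2), we sum (6.17) over all $K \in \mathcal{T}$, giving

$$a_{\mathcal{T}}(u,v) = (S,v)_{0,\Omega} + \sum_{K \in \mathcal{T}} (v, n_K \cdot q_K)_{0,\partial K}$$  (6.18)

where $n_K$ is the exterior unit normal to $K \in \mathcal{T}$ and $q_K = q|_K$. Denote
by the skeleton $\mathcal{S}_{\rm int}$ the union of all element intersections of positive $d-1$
dimensional measure,

$$\mathcal{S}_{\rm int} = \Bigl\{e = \overline{K} \cap \overline{K'} \in \mathcal{E} : K, K' \in \mathcal{T},\ \int_e ds > 0\Bigr\}$$  (6.19)

and set

$$\mathcal{S}_D := \{e \in \mathcal{E} : e \subset \Gamma_D\}$$  (6.20)

where $\mathcal{E}$ is the set of all $d-1$ dimensional element boundary segments.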

For the exact solution $u$ of (6.2), the fluxes $n_K \cdot q(\nabla u)$ are continuous
across edges $e \in \mathcal{S}_{\rm int}$. The MEM for (6.12), (6.13) consists in enforcing the
vanishing of the jumps of $u$ across $e \in \mathcal{S}_{\rm int}$ as follows:
find $u \in H^1_D(\Omega,\mathcal{T})$, $\mu \in M$ such that

$$a_{\mathcal{T}}(u,v) + b_{\mathcal{T}}(v,\mu) = (S,v)_{0,\Omega} + (g,v)_{0,\Gamma_N} \quad \forall v \in H^1_D(\Omega,\mathcal{T}),$$
$$b_{\mathcal{T}}(u,\lambda) = 0 \quad \forall \lambda \in M.$$  (6.21)

Here

$$b_{\mathcal{T}}(u,\lambda) := \sum_{e \in \mathcal{S}_{\rm int}} ([u],\lambda)_{0,e}$$

and $[u]$ denotes the jump of $u \in H^1(\Omega,\mathcal{T})$ across $e \in \mathcal{S}_{\rm int}$. The Mortar space
$M$ is a multiplier space of functions on the skeleton $\mathcal{S}_{\rm int}$. Notice that by (6.18),
if $u$ is smooth, the mortar $\mu$ in (6.21) will give the canonical flux $n_K \cdot q(\nabla u)$
on $e \in \mathcal{S}_{\rm int}$. Note also that (6.21) has saddle point form, similarly to the
mixed formulation (6.7).

It is crucial for the stability of (6.21) that $b_{\mathcal{T}}(\cdot,\cdot)$ satisfies a suitable inf-sup
condition; this is indeed the case, see e.g. [8].

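The finite element discretization of (6.21) is as usual:

find $u_{FE} \in S_D^{k,0}(\Omega,\mathcal{T},F_{\mathcal{P}})$, $\mu_{FE} \in M^{k,0}(\Omega,\mathcal{T})$:

$$a_{\mathcal{T}}(u_{FE},v) + b_{\mathcal{T}}(v,\mu_{FE}) = (S,v)_{0,\Omega} + (g,v)_{0,\Gamma_N} \quad \forall v \in S_D^{k,0},$$
$$b_{\mathcal{T}}(u_{FE},\lambda) = 0 \quad \forall \lambda \in M^k_{\mathcal{T}}.$$  (6.22)


Here the additional mortar space $M^k_{\mathcal{T}}$ enters, similarly to the flux-subspaces
$S^k_{\rm div} \subset H(\operatorname{div},\Omega)$ in (6.7). Several choices for $M^k_{\mathcal{T}}$ are possible. However, care
must be taken that the form $b_{\mathcal{T}}(\cdot,\cdot)$ satisfies the discrete inf-sup condition

$$\inf_{\lambda \in M^k_{\mathcal{T}}}\ \sup_{v \in S^{k,0}} \frac{b_{\mathcal{T}}(v,\lambda)}{\|v\|_{1,\Omega,\mathcal{T}}\,\|\lambda\|_M} \ge \gamma(\mathcal{T},k) > 0.$$  (6.23)

For uniform degree $k$, the mortar space

$$M^k_{\mathcal{T}} = \{\lambda \in L^2(\mathcal{S}_{\rm int}) : \lambda|_e \in \mathcal{P}_{k-1}(e)\}$$  (6.24)

has been shown (for a fixed patch mesh $\mathcal{T}_{\mathcal{P}}$ allowing in particular also
geometric meshes) in [63] to have an inf-sup constant $\gamma(\mathcal{T},k) \ge C(n)\,k^{-3/4}$ in
two dimensions.

The usual theory of mixed methods (see e.g. [16]) implies then quasi
optimal error bounds for $u$ as well as for the fluxes $\mu_{FE}$.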

Remark 6.1. Notice that the degree of the mortar space $M^k_{\mathcal{T}}$ in (6.24) is
one less than $k$ in the domain - the lowest degree admissible is hence $k = 1$;
no variant of the MEM is known which admits $k = 0$ in the elements. In
comparison with the mixed formulation (6.7), the mortar method involves fewer
additional degrees of freedom - only fluxes on interfaces must be discretized,
rather than fluxes in the elements. Nevertheless, the mortar approach still
involves more DOF than the conforming method (6.4).


Implementation without fluxes. It is possible to eliminate the mortars
$\mu_{FE}$, $\lambda$ from (6.22), thereby reducing the number of unknowns. The idea is
to restrict $u_{FE}$ and $v \in S_D^{k,0}$. If, for example, $[u_{FE}] = 0$ and $[v] = 0$ on $\mathcal{S}_{\rm int}$,
so that $u_{FE}$ and $v$ are continuous, $b_{\mathcal{T}}$ in (6.22) vanishes and we get again the
symmetric formulation (6.4) (since then $u_{FE}, v \in S_D^{k,1}$), i.e. nothing new.

A second possibility not enforcing interelement continuity is to use (6.24)
and to restrict $u_{FE}$, $v$ to

$$S_D^{k,s} = \Bigl\{u \in S_D^{k,0} : \forall e_{KK'} \in \mathcal{S}_{\rm int}\ \forall \varphi \in \mathcal{P}_{k-1}(e_{KK'}) : \int_{e_{KK'}} (u|_K - u|_{K'})\,\varphi\,ds = 0\Bigr\}$$  (6.25)

resp. more generally

$$S_D^{k,s} = \Bigl\{u \in S_D^{k,0} : \forall e_{KK'} \in \mathcal{S}_{\rm int}\ \forall \psi \in M^k_{\mathcal{T}} : \int_{e_{KK'}} (u|_K - u|_{K'})\,\psi|_{e_{KK'}}\,ds = 0\Bigr\}$$  (6.26)

where $e_{KK'} = \overline{K} \cap \overline{K'}$ for $K, K' \in \mathcal{T}$.

We observe that on any $e_{KK'}$ the jump $[u]$ belongs to $\mathcal{P}_k(e_{KK'})$. The
orthogonality

$$\int_{e_{KK'}} [u]\,\psi\,ds = 0 \quad \forall \psi \in \mathcal{P}_{k-1}(e_{KK'})$$  (6.27)

consists in $k = \dim \mathcal{P}_{k-1}(e_{KK'})$ constraints which are linear combinations of
the side degrees of freedom of $u|_K$ and $u|_{K'}$. The condensed stiffness matrix
$A$ can be written in the form

$$A = Q^T \operatorname{diag}\{A_K : K \in \mathcal{T}\}\,Q$$  (6.28)

where $A_K$ are elemental stiffness matrices corresponding to $a_K(u,v)$ in (6.16)
and the matrices $Q$ contain the coefficients of the constraints (6.27). In
iterative solvers for $Ax = b$, (6.28) is never formed explicitly and, in particular,
the element stiffness matrices $A_K$ could reside on different processors during
the iterations.

It can be proved that the bilinear form $a_{\mathcal{T}}(u,v)$ is coercive on $S_D^{k,s}$ and
hence the matrix $A$ is positive-definite [11], i.e. the mortar discretization
(6.22) preserves dissipativity. Note however, that the coercivity constant
resulting from the proof in [11] depends on the triangulation in an unspecified
way.

Remark 6.2. We emphasize that the MEM presented here differs from the
one considered in [8], [9], [11], in that we allow here discontinuities on each
edge whereas the cited works treated the MEM as a variant of the domain
decomposition method where the number of subdomains coupled by the
mortar is fixed and mesh refinement with conforming elements takes place within
the subdomains. Clearly, the formulation presented here is more general and
closely related to FEM with Lagrange multipliers resp. to the global
element method.

Remark 6.3. Finally, we remark that the MEM with eliminated fluxes
coincides at least in one case with a known method: consider on $\mathcal{T}$ consisting
of triangles the space $S^{1,0}(\Omega,\mathcal{T})$ of piecewise linear, discontinuous functions.
Choosing the mortar space $M^1_{\mathcal{T}}$ of piecewise constants on the edges, (6.28)
implies that the averages of the jumps of $u \in S^{1,s}$ over each edge must
vanish - this element is just the Crouzeix-Raviart element. Here the matrix $A$
is coercive independent of the meshwidth.
6.4 Discontinuous Galerkin Method for second order problems

The DGFEM allows one to discretize diffusion problems with discontinuous shape
functions without extra unknowns due to fluxes or multipliers. The stiffness
matrix is nonsymmetric but positive semidefinite, which is desirable for
explicit time stepping schemes.


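Derivation from the Mortar Method. Closely related to the MEM is
the Discontinuous Galerkin (DG) method for the problem (2.7), (2.8). It can
be derived as follows: consider (6.21). Adding the equations, we get: find
$(u,\mu) \in H^1_D(\Omega,\mathcal{T}) \times M$ such that

$$B_{\mathcal{T}}(u,\mu;v,\lambda) = \ell(v,\lambda) \quad \forall (v,\lambda) \in H^1_D(\Omega,\mathcal{T}) \times M$$  (6.29)

where we set

$$B_{\mathcal{T}}(u,\mu;v,\lambda) := a_{\mathcal{T}}(u,v) + b_{\mathcal{T}}(v,\mu) + b_{\mathcal{T}}(u,\lambda),$$
$$\ell(v,\lambda) := (S,v)_{0,\Omega} + (g,v)_{0,\Gamma_N}$$

and where $M$ is a suitable mortar space.

(6.29) is equivalent to (6.21), and its discretization reads:
find $(u_{FE},\mu_{FE}) \in V^k_{\mathcal{T}}$ such that

$$B_{\mathcal{T}}(u_{FE},\mu_{FE};v,\lambda) = \ell(v,\lambda) \quad \forall (v,\lambda) \in V^k_{\mathcal{T}}$$  (6.30)

where

$$V^k_{\mathcal{T}} := S_D^{k,0}(\Omega,\mathcal{T}) \times M^{k,0}(\Omega,\mathcal{T}).$$

The discontinuous Galerkin FEM consists in eliminating a-priori the
multipliers $\mu_{FE}$ and $\lambda$ in (6.30) by the flux-averages: on $e_{KK'} \subset \mathcal{S}_{\rm int}$, set

$$\mu_{FE} = \tfrac{1}{2}\,(q_K \cdot n_e + q_{K'} \cdot n_e) =: \langle q(\nabla u) \cdot n_e \rangle$$  (6.31)

where $n_e$ is, for example, the exterior unit normal $n_K$ to the element $K$ with
higher index (any other, fixed, choice of $n_e$ would do). Analogously, we select

$$\lambda = -\langle q(\nabla v) \cdot n_e \rangle \quad \text{on } e \in \mathcal{S}_{\rm int}$$  (6.32)

and get the DGFEM: find $u_{DG} \in S_D^{k,0}$ such that

$$B_{DG}(u_{DG},v) := \sum_{K \in \mathcal{T}} a_K(u_{DG},v)
+ \sum_{e \in \mathcal{S}_{\rm int}} \int_e \bigl([u_{DG}]\,\langle q(\nabla v) \cdot n_e \rangle - [v]\,\langle q(\nabla u_{DG}) \cdot n_e \rangle\bigr)\,ds
= (S,v)_{0,\Omega} + (g,v)_{0,\Gamma_N}$$  (6.33)

for all $v \in S_D^{k,0}$. Here $q(\nabla u) = \varepsilon A\nabla u$, cf. (6.2).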
The minus sign in (6.32) is crucial - taking there plus gives a symmetric
bilinear form which is, however, indefinite - this property is very bad for
explicit time stepping schemes.

In contrast, the form $B_{DG}(\cdot,\cdot)$ in (6.33) is nonsymmetric, but positive
semidefinite, i.e.

$$\forall u \in H^1(\Omega,\mathcal{T}) : \quad B_{DG}(u,u) = \sum_K a_K(u,u) \ge 0,$$  (6.34)

i.e. (the real parts of) the eigenvalues of the corresponding stiffness matrix
$A_{DG}$ are nonnegative and the DG discretization (6.33) of the diffusion
operator will be dissipative in an explicit time-stepping scheme, an observation
due to Oden and Baumann [49].
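
The identity (6.34) is a purely algebraic consequence of the skew-symmetric structure of the interface terms, and it can be checked on a toy problem. The sketch below evaluates a one-dimensional analogue of the form (6.33) for discontinuous piecewise linear functions on a uniform mesh with $A = 1$. The data layout (one pair of end values per element), the orientation of the jump and the function names are assumptions made for this illustration.

```python
def a_K(u, v, h, eps=1.0):
    """Element form eps*(u', v') + (u, v) for linear u, v given by end values (left, right)."""
    (ul, ur), (vl, vr) = u, v
    stiffness = eps * (ur - ul) * (vr - vl) / h
    mass = h / 6.0 * (2 * ul * vl + ul * vr + ur * vl + 2 * ur * vr)
    return stiffness + mass

def B_DG(U, V, h, eps=1.0):
    """1D analogue of (6.33): element terms plus skew-symmetric interface terms."""
    total = sum(a_K(u, v, h, eps) for u, v in zip(U, V))
    for i in range(1, len(U)):
        # Jump across the interior node between elements i-1 and i (assumed orientation).
        jump_u = U[i - 1][1] - U[i][0]
        jump_v = V[i - 1][1] - V[i][0]
        # Average of the fluxes eps*u' from both neighbouring elements.
        flux_u = 0.5 * eps * ((U[i - 1][1] - U[i - 1][0]) + (U[i][1] - U[i][0])) / h
        flux_v = 0.5 * eps * ((V[i - 1][1] - V[i - 1][0]) + (V[i][1] - V[i][0])) / h
        total += jump_u * flux_v - jump_v * flux_u
    return total

h = 1.0 / 3.0
U = [(0.0, 1.0), (0.5, 2.0), (1.5, -1.0)]    # discontinuous piecewise linear function
V = [(1.0, 0.0), (0.2, 0.3), (-1.0, 2.0)]

diagonal = B_DG(U, U, h)                      # interface terms cancel for v = u
element_sum = sum(a_K(u, u, h) for u in U)
asymmetry = B_DG(U, V, h) - B_DG(V, U, h)     # nonzero: the form is nonsymmetric
```

Reversing the orientation of the jump and of the normal simultaneously leaves the interface terms unchanged, so the conclusion does not depend on the orientation chosen here.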


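Stability of the DG-method. The stability of (6.33) is, to some extent,
an open problem. We prove here

Proposition 6.4. Assume that $\mathcal{T}$ is a quasi-uniform, shape-regular mesh on
$\Omega$ of meshwidth $h$ and that there exists $c > 0$ such that

$$\forall K \in \mathcal{T}\ \forall u \in H^1(K) : \quad a_K(u,u) \ge c\,\|u\|^2_{1,K}$$  (6.35)

and

$$\forall K \in \mathcal{T}\ \forall u,v \in H^1(K) : \quad a_K(u,v) \le c^{-1}\,\|u\|_{1,K}\,\|v\|_{1,K}.$$  (6.36)

Define further on $H^1(\Omega,\mathcal{T})$ the broken norm

$$\|u\|_{\mathcal{T}} := \Bigl(\sum_K \|u\|^2_{1,K}\Bigr)^{1/2}.$$  (6.37)

Then there holds

$$\inf_{0 \ne u \in S^{k,0}}\ \sup_{0 \ne v \in S^{k,0}} \frac{|B_{DG}(u,v)|}{\|u\|_{\mathcal{T}}\,\|v\|_{\mathcal{T}}} \ge \gamma > 0$$  (6.38)

and

$$|B_{DG}(u,v)| \le C(k+1)h^{-1}\,\|u\|_{\mathcal{T}}\,\|v\|_{\mathcal{T}}$$  (6.39)

where $C, \gamma > 0$ are independent of $h$ and of $k$; they depend only on the
shape-regularity of the elements.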

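Proof.

1) Given $u \in S^{k,0} \subset H^1(\Omega,\mathcal{T})$, select $v_u = u$. Then $\|v_u\|_{\mathcal{T}} = \|u\|_{\mathcal{T}}$ and,
by (6.35),

$$B_{DG}(u,v_u) = B_{DG}(u,u) = \sum_K a_K(u,u) \ge c \sum_K \|u\|^2_{H^1(K)} = c\,\|u\|^2_{\mathcal{T}}.$$

2) Let $u, v \in S^{k,0}$. Then from (6.33) with $\varepsilon = 1$ we get, using (6.36),

$$|B_{DG}(u,v)| \le \sum_K |a_K(u,v)| + \sum_e \|[u]\|_{0,e}\,\|\langle n_e \cdot A\nabla v\rangle\|_{0,e} + \sum_e \|[v]\|_{0,e}\,\|\langle n_e \cdot A\nabla u\rangle\|_{0,e}$$
$$\le c^{-1} \sum_K \|u\|_{1,K}\,\|v\|_{1,K} + C \sum_e h^{-1/2}\,\|u\|_{1,K \cup K'}\,(k+1)\,h^{-1/2}\,\|v\|_{1,K \cup K'}$$
$$\le c^{-1}\,\|u\|_{\mathcal{T}}\,\|v\|_{\mathcal{T}} + C(k+1)h^{-1}\,\|u\|_{\mathcal{T}}\,\|v\|_{\mathcal{T}}$$

where we used the trace inequality

$$\|u\|^2_{0,e} \le C\bigl(\|\nabla u\|_{0,K}\,\|u\|_{0,K} + h^{-1}\,\|u\|^2_{0,K}\bigr)$$

and, by (6.36),

$$\|n_e \cdot A\nabla v\|_{0,e} \le \|A\nabla v\|_{0,e} \le c\,(k+1)\,h^{-1/2}\,\|\nabla v\|_{0,K}.$$

□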

Remark 6.5. Note that (6.35) rules out the case when we have pure diffusion, i.e. the Laplacean. Then $a_K(u,u) = \int_K |\nabla u|^2\, dx$ and (6.35) is violated. Moreover, in this case

\[
B_{DG}(u,u) = \sum_K \int_K |\nabla u|^2\, dx = 0 \iff u \in S^{0,0}(\Omega, \mathcal{T}) , \tag{6.40}
\]

i.e. the bilinear form $B_{DG}$ has a large kernel. Note also that for diffusion problems resulting from implicit time discretization, (6.35) is usually satisfied, see Section 11 below for more.

Remark 6.6. In one dimension an inf-sup condition (6.38) and continuity (6.32) with constants independent of $h$ and $k$ hold [49] even in the absence of an absolute term. In our case, the hp-error estimates of Section 5 apply with a loss of $(k+1) h^{-1}$.

Remark 6.7. The form $a_K$ in assumption (6.35) is as in (6.16), with $\varepsilon = 1$. Nevertheless, the argument in the proof goes through also for $\varepsilon < 1$ if in the definition (6.37) of the norm $\|\circ\|_{H^1(K)}^2$ is replaced by $\varepsilon\, |\circ|_{1,K}^2 + \|\circ\|_{0,K}^2$.

Remark 6.8. In terms of computational efficiency, the DGFEM (6.33) has numerous advantages over, e.g., the schemes in 6.2 and 6.3. For example, since continuity is only weakly enforced, there is no need to code interelement constraints any more. This, in turn, allows one to modify the definition (4.5) of the FE space in that the FE space on element $K \in \mathcal{T}$ need not be defined in terms of parametric element mappings $F_K$. Rather, we can, in the DGFEM, adopt the definition

\[
S^{k,0}(\Omega, \mathcal{T}) = \{ u \in L^2(\Omega) : u|_K \in \mathcal{P}_{k_K} \text{ for } K \in \mathcal{T} \} ,
\]

i.e. the FE-spaces may be defined in local Cartesian coordinates. Moreover, even if $K$ is a quadrilateral element, the local approximation space may be $\mathcal{P}_k$ rather than $\mathcal{Q}_k$.

Remark 6.9. We have seen in Proposition 6.4 that in the hp-DGFEM we must generally expect a loss of optimal convergence. It can be shown, however, that this loss of convergence orders can be overcome by a stabilization via penalization of the interelement jumps, a device going back to J. Nitsche in 1971. There, the bilinear form $B_{DG}(u,v)$ in (6.33) is modified by an additional term to

\[
B_{DG}^{\gamma}(u,v) := B_{DG}(u,v) + \sum_e \gamma_e \int_e [u]\,[v]\, ds ,
\]

where $\gamma_e > 0$ is a stabilization parameter to be selected. The resulting method has the advantage of being defined also for $k = 0$, i.e., for piecewise constants. Judicious choice of $\gamma_e$ allows one to recover optimal convergence rates in the diffusive case ([37], Section 4). The price to be paid for the penalization of the interelement jumps is a) increased stiffness and condition number of the discrete problem and b) loss of the elementwise conservation property.

7  Convection 

Contrary  to  the  reaction-diffusion  case,  the  convection  problem  (2.9)  and  the 
continuity  equation  (1.1)  are  first  order,  hyperbolic  equations.  Consequently, 
the  variational  formulation  underlying  the  hp-FEM  will  not  be  symmetric 
any  more  and  a  standard  Galerkin  approach  as  in  the  reaction-diffusion  case 
is  well  known  to  have  poor  stability  properties.  This  parallels  the  classical 
instability  of  central  differencing  for  the  linear  advection  equation.  To  obtain 
stable  discretizations,  some  sort  of  stabilization  must  be  introduced  into  the 
variational  formulation.  We  will  discuss  the  following  devices:  a)  streamline 
diffusion  techniques  and  b)  discontinuous  Galerkin  approximations. 

The streamline diffusion method was introduced by Hughes and Johnson and their coworkers in the early 1980s in order to combat instabilities of $C^0$-FEM for advection-dominated flows [36], [38], [39]. It consists in replacing the test function $v$ in the Galerkin scheme by $v + \delta_K \mathcal{L} v$, where $\mathcal{L}$ is the advection operator. The parameter $\delta$ must be chosen in terms of the discretization parameters, i.e. the meshwidth and, in hp-FEM, also in terms of the elemental polynomial degree $k_K$. This so-called stabilization parameter is at the disposal of the analyst and can be adjusted in specific computations, but for each element $K \in \mathcal{T}$ there is a coupling to $h_K$ and $k_K$ which ensures the optimal convergence rates for first order problems, both in $h$ and $k$.
7.1 Model convection problem

Let $\Omega$ be a bounded curved polyhedral domain in $\mathbb{R}^d$, $d \ge 2$. Given that $a = (a_1, \ldots, a_d)$ is a $d$-component vector function defined on $\bar\Omega$ with $a_i \in C^1(\bar\Omega)$, $i = 1, \ldots, d$, we define the following subsets of $\Gamma = \partial\Omega$:

\[
\Gamma_- = \{ x \in \Gamma : a(x) \cdot n(x) < 0 \}, \qquad \Gamma_+ = \{ x \in \Gamma : a(x) \cdot n(x) > 0 \} ,
\]

where $n(x)$ denotes the unit outward normal vector to $\Gamma$ at $x \in \Gamma$. It is assumed here implicitly that in these definitions $x$ ranges only through those points of $\Gamma$ at which $n(x)$ is defined; consequently, $\Gamma_-$ and $\Gamma_+$ are not necessarily connected subsets of $\Gamma$.

For the sake of simplicity, we shall suppose that $\Gamma$ is non-characteristic in the sense that $\Gamma_- \cup \Gamma_+ = \Gamma$.

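The convection problem (2.9) takes the form

\[
\mathcal{L} u \equiv a \cdot \nabla u + b u = S \ \text{ in } \Omega , \qquad u = f \ \text{ on } \Gamma_- , \tag{7.1}
\]

for some $b \in C(\bar\Omega)$, $S \in L^2(\Omega)$, $f \in L^2(\Gamma_-)$.

This problem has a unique weak solution $u \in L^2(\Omega)$ with $a \cdot \nabla u \in L^2(\Omega)$ and the boundary condition satisfied as an equality in $[H^{1/2}_{00}(\Gamma_-)]'$.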

In  the  next  two  subsections  we  shall  formulate  the  hp-streamline  diffusion 
and  hp-discontinuous  Galerkin  finite  element  approximation  of  (7.1). 


7.2  The  hp-SDFEM 

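The hp-SDFEM approximation of (7.1) is defined as follows: find $u_{SD} \in S^{k,1}$ such that

\[
B_{SD}(u_{SD}, v) := (\mathcal{L} u_{SD}, v + \delta \mathcal{L} v) + (u_{SD}, v)_{\Gamma_-} = F_{SD}(v) := (S, v + \delta \mathcal{L} v) + (f, v)_{\Gamma_-} \quad \forall v \in S^{k,1} , \tag{7.2}
\]

where $\delta$ is a positive piecewise constant function defined on the mesh triangulation $\mathcal{T}$.

In (7.2), $(\cdot,\cdot)$ denotes the inner product of $L^2(\Omega)$ and

\[
(w, v)_{\Gamma_-} = \int_{\Gamma_-} |a \cdot n|\, w v\, ds ,
\]

with analogous definition of $(\cdot,\cdot)_{\Gamma_+}$ and associated norms $\|\cdot\|_{\Gamma_-}$ and $\|\cdot\|_{\Gamma_+}$. The stability of the hp-SDFEM is expressed in the next lemma.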
Lemma 7.1. Suppose that there exists a positive constant $c_0$ such that

\[
b(x) - \tfrac{1}{2} \nabla \cdot a(x) \ge c_0 , \quad x \in \bar\Omega . \tag{7.3}
\]

Then $u_{SD}$ obeys for $\delta > 0$ the bound

\[
\|\sqrt{\delta}\, \mathcal{L} u_{SD}\|^2 + c_0 \|u_{SD}\|^2 + \|u_{SD}\|_{\Gamma_+}^2 + \tfrac{1}{2} \|u_{SD}\|_{\Gamma_-}^2 \le \|\sqrt{\delta}\, S\|^2 + \frac{1}{c_0} \|S\|^2 + 2 \|f\|_{\Gamma_-}^2 .
\]

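Proof. Select $v = u_{SD}$ in (7.2) and note that

\[
(\mathcal{L} u_{SD}, u_{SD}) + (u_{SD}, u_{SD})_{\Gamma_-} = \big( (b - \tfrac{1}{2} \nabla \cdot a)\, u_{SD}, u_{SD} \big) + \tfrac{1}{2} \|u_{SD}\|_{\Gamma_+}^2 + \tfrac{1}{2} \|u_{SD}\|_{\Gamma_-}^2 . \tag{7.4}
\]

Applying (7.3) here and using the Cauchy-Schwarz inequality on the right-hand side in (7.2) with $v = u_{SD}$, the result follows. □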
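The identity used in this proof rests on a single integration by parts. The following is a sketch of that step, assuming $a \in C^1(\bar\Omega)^d$ and a function $v$ regular enough for the divergence theorem to apply; it uses only the weighted boundary norms $\|v\|_{\Gamma_\pm}^2 = \int_{\Gamma_\pm} |a \cdot n|\, v^2\, ds$ and the fact that $a \cdot n = \pm |a \cdot n|$ on $\Gamma_\pm$.

```latex
\begin{aligned}
(a \cdot \nabla v, v)
  &= \tfrac{1}{2} \int_\Omega a \cdot \nabla (v^2)\, dx
   = -\tfrac{1}{2} \int_\Omega (\nabla \cdot a)\, v^2\, dx
     + \tfrac{1}{2} \int_\Gamma (a \cdot n)\, v^2\, ds \\
  &= -\tfrac{1}{2} \big( (\nabla \cdot a)\, v, v \big)
     + \tfrac{1}{2} \|v\|_{\Gamma_+}^2 - \tfrac{1}{2} \|v\|_{\Gamma_-}^2 , \\[4pt]
(\mathcal{L} v, v) + (v, v)_{\Gamma_-}
  &= (a \cdot \nabla v, v) + (b v, v) + \|v\|_{\Gamma_-}^2 \\
  &= \big( (b - \tfrac{1}{2} \nabla \cdot a)\, v, v \big)
     + \tfrac{1}{2} \|v\|_{\Gamma_+}^2 + \tfrac{1}{2} \|v\|_{\Gamma_-}^2 .
\end{aligned}
```

The inflow term $(v,v)_{\Gamma_-}$ in the bilinear form is what turns the negative boundary contribution $-\tfrac12 \|v\|_{\Gamma_-}^2$ into a positive one.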


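We observe that the bound in Lemma 7.1 controls the $L^2$-norm of the discrete solution as well as some derivatives of it in the advection direction, provided $\delta > 0$. For $\delta = 0$ the first term on the left-hand side vanishes, so the control of the streamline derivative is lost and only $L^2$-stability remains.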

Now we turn to the error analysis of (7.2). We begin by decomposing

\[
u - u_{SD} = (u - \Pi u) + (\Pi u - u_{SD}) \equiv \eta + \xi , \tag{7.5}
\]

where $\Pi u$ is a suitable projection of $u$ into $S^{k,1}$; the choice of the projector $\Pi$ will be deferred until later. The key is a bound on $\xi$ in terms of $\eta$; the final error bound on $u - u_{SD}$ will then follow from bounds on the projection error $\eta$ in Section 7.4 below.

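Lemma 7.2. Assuming that (7.3) holds, and that $u \in H^1(\Omega)$, we have

\[
\|\sqrt{\delta}\, \mathcal{L} \xi\|^2 + \|c\, \xi\|^2 + \tfrac{1}{2} \|\xi\|_{\Gamma_+}^2 + \|\xi\|_{\Gamma_-}^2 \le \Big\| \sqrt{\delta}\, \mathcal{L} \eta - \frac{1}{\sqrt{\delta}}\, \eta \Big\|^2 + 4 \|c\, \eta\|^2 + 2 \|\eta\|_{\Gamma_+}^2 , \tag{7.6}
\]

where $c \in C(\bar\Omega)$ is defined by

\[
c^2(x) = b(x) - \tfrac{1}{2} \nabla \cdot a(x) , \quad x \in \bar\Omega . \tag{7.7}
\]

For the proof, we refer to [36].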
7.3 Discontinuous-Galerkin hp-FEM

Given that $K$ is an element in the partition $\mathcal{T}$, we denote by $\partial K$ the union of open faces of $K$. This is non-standard notation in that $\partial K$ is a subset of the boundary of $K$. Let $x \in \partial K$ and suppose that $n(x)$ denotes the unit outward normal vector to $\partial K$ at $x$. With these conventions, we define the inflow and outflow parts of $\partial K$, respectively, by

\[
\partial_- K = \{ x \in \partial K : a(x) \cdot n(x) < 0 \}, \qquad \partial_+ K = \{ x \in \partial K : a(x) \cdot n(x) \ge 0 \} .
\]

For each $K \in \mathcal{T}$ and any $v \in H^1(K)$ we denote by $v^+$ the interior trace of $v$ on $\partial K$ (taken from within $K$). Now consider an element $K$ such that the set $\partial_- K \setminus \Gamma_-$ is nonempty; then for each $x \in \partial_- K \setminus \Gamma_-$ (with the exception of a set of $(d-1)$-dimensional measure zero) there exists a unique element $K'$, depending on the choice of $x$, such that $x \in \partial_+ K'$. This is illustrated in Figure 7.1.

Fig. 7.1. A point $x$ such that $x \in \partial_- K$ and $x \in \partial_+ K'$


Now suppose that $v \in H^1(K)$ for each $K \in \mathcal{T}$. If $\partial_- K \setminus \Gamma_-$ is nonempty for some $K \in \mathcal{T}$, then we can also define the outer trace $v^-$ of $v$ on $\partial_- K \setminus \Gamma_-$ relative to $K$ as the inner trace $v^+$ relative to those elements $K'$ for which $\partial_+ K'$ has intersection with $\partial_- K \setminus \Gamma_-$ of positive $(d-1)$-dimensional measure. We also introduce the jump of $v$ across $\partial_- K \setminus \Gamma_-$:

\[
[v] = v^+ - v^- .
\]

Let $\delta \in H^1(K)$ for each $K \in \mathcal{T}$, and suppose that $\delta$ is positive on each $K \in \mathcal{T}$. Typically, $\delta$ is chosen to be constant on each $K \in \mathcal{T}$, although we shall not require this for now.




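Suppose that $v, w \in H^1(K)$ for each $K \in \mathcal{T}$. We define

\[
B_{DG}(w, v) = \sum_K \int_K \mathcal{L} w\, (v + \delta \mathcal{L} v)\, dx - \sum_K \int_{\partial_- K \setminus \Gamma_-} (a \cdot n)\, [w]\, v^+\, ds - \sum_K \int_{\partial_- K \cap \Gamma_-} (a \cdot n)\, w^+ v^+\, ds \tag{7.8}
\]

and put

\[
\ell_{DG}(v) = \sum_K \int_K f\, (v + \delta \mathcal{L} v)\, dx - \sum_K \int_{\partial_- K \cap \Gamma_-} (a \cdot n)\, g\, v^+\, ds . \tag{7.9}
\]

Here $f$ denotes the right-hand side and $g$ the inflow boundary datum, i.e. they play the roles of $S$ and $f$ in (7.1). Note that $\delta$ in (7.8), (7.9) is a stabilization parameter; for $\delta = 0$, we get the usual discontinuous Galerkin method. The hp-DGFEM approximation of (7.1) is defined as follows: find $u_{DG} \in S^{k,0}$ such that

\[
B_{DG}(u_{DG}, v) = \ell_{DG}(v) \quad \forall v \in S^{k,0} . \tag{7.10}
\]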
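To make the structure of the discontinuous Galerkin scheme concrete, here is a minimal one-dimensional sketch. It assumes the simplest possible setting, which is not treated in the text: $\Omega = (0,1)$, $a = 1$, constant $b$, piecewise constant trial and test functions, $\delta = 0$, and a midpoint rule for the source. Testing with the indicator function of cell $i$, the volume term contributes $b h u_i$ and the inflow face contributes the jump $u_i - u_{i-1}$, so the scheme reduces to an upwind recursion. The function names are hypothetical.

```python
import math

def dg0_upwind(n, b=1.0, g=1.0, f=lambda x: 0.0):
    """Piecewise-constant DG (delta = 0) for u' + b*u = f on (0,1), u(0) = g.

    Testing with the indicator function of cell i gives
        b*h*u_i + (u_i - u_{i-1}) = h*f(midpoint_i),   u_0 := g,
    where the cell integral of f is approximated by the midpoint rule.
    """
    h = 1.0 / n
    u_prev, values = g, []
    for i in range(n):
        x_mid = (i + 0.5) * h
        u_prev = (u_prev + h * f(x_mid)) / (1.0 + b * h)
        values.append(u_prev)
    return values

def outflow_error(n):
    """Error of the last cell value against the exact outflow value exp(-1)."""
    return abs(dg0_upwind(n)[-1] - math.exp(-1.0))

# Halving h should roughly halve the error (first-order convergence for k = 0).
errors = [outflow_error(n) for n in (20, 40, 80)]
rates = [math.log2(errors[i] / errors[i + 1]) for i in range(2)]
```

The jump term couples each cell only to its upwind neighbour, which is why the solution can be computed cell by cell in the flow direction without solving a global system.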


Next  we  study  the  stability  of  the  discrete  problem  (7.10). 

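Lemma 7.3. Suppose that there exists a positive constant $c_0$ such that (7.3) holds. Then $u_{DG}$ obeys the bound

\[
\begin{aligned}
&\sum_K \Big( \|\sqrt{\delta}\, \mathcal{L} u_{DG}\|_K^2 + c_0 \|u_{DG}\|_K^2 \Big) + \sum_K \|u_{DG}^+ - u_{DG}^-\|_{\partial_- K \setminus \Gamma_-}^2 \\
&\qquad + \sum_K \|u_{DG}^+\|_{\partial_+ K \cap \Gamma_+}^2 + \tfrac{1}{2} \sum_K \|u_{DG}^+\|_{\partial_- K \cap \Gamma_-}^2 \\
&\quad \le \sum_K \|\sqrt{\delta}\, f\|_K^2 + \frac{1}{c_0} \sum_K \|f\|_K^2 + 2 \sum_K \|g\|_{\partial_- K \cap \Gamma_-}^2 .
\end{aligned} \tag{7.11}
\]

The proof can be found in [36].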

Remark 7.4. This bound is analogous to the estimate in Lemma 7.1 for the hp-SDFEM.

We now discuss the error analysis of hp-DGFEM. We write

\[
u - u_{DG} = (u - \Pi u) + (\Pi u - u_{DG}) \equiv \eta + \xi , \tag{7.12}
\]

where $\Pi u$ is a suitable projection of $u$ into $S^{k,0}$, to be chosen below. There is an analog of Lemma 7.2:
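Lemma 7.5. Assume that (7.3) holds and that $u \in H^1(K)$ for each $K \in \mathcal{T}$. Then we have that

\[
\begin{aligned}
&\sum_K \|\sqrt{\delta}\, \mathcal{L} \xi\|_K^2 + \sum_K \|c\, \xi\|_K^2 + \sum_K \|\xi^+\|_{\partial_- K \cap \Gamma_-}^2 \\
&\qquad + \tfrac{1}{2} \sum_K \|\xi^+\|_{\partial_+ K \cap \Gamma_+}^2 + \tfrac{1}{2} \sum_K \|\xi^+ - \xi^-\|_{\partial_- K \setminus \Gamma_-}^2 \\
&\quad \le \sum_K \Big\| \sqrt{\delta}\, \mathcal{L} \eta - \frac{1}{\sqrt{\delta}}\, \eta \Big\|_K^2 + 4 \sum_K \|c\, \eta\|_K^2 \\
&\qquad + 2 \sum_K \|\eta^+\|_{\partial_+ K \cap \Gamma_+}^2 + \sum_K \|\eta^-\|_{\partial_- K \setminus \Gamma_-}^2 .
\end{aligned} \tag{7.13}
\]

The proof follows by elementary manipulation and we refer to [36] for details.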


7.4 hp-Error Analysis of the DG- and the SDFEM

In this section, we shall construct the hp-approximation projector $\Pi$ in the error estimates (7.5) and derive hp-error bounds for the hp-SDFEM as well as for the hp-DGFEM introduced in the previous section. The bounds are explicit in $h$ and $p$ and in the regularities of the solution and allow one to deduce, in particular, exponential convergence estimates for piecewise analytic solutions.

We are now in position to present error estimates for both the SD- and the DGFEM. We shall use the following mesh dependent norm defined by

\[
|||u|||_{DG}^2 := \sum_{K \in \mathcal{T}} \Big\{ \|\sqrt{\delta}\, \mathcal{L} u\|_K^2 + \|c\, u\|_K^2 + \|u^+\|_{\partial_- K \cap \Gamma_-}^2 + \tfrac{1}{2} \|u^+\|_{\partial_+ K \cap \Gamma_+}^2 + \tfrac{1}{2} \|u^+ - u^-\|_{\partial_- K \setminus \Gamma_-}^2 \Big\} . \tag{7.14}
\]

Notice that for the SDFEM, the last term vanishes. Here is our main error estimate for the hp-DGFEM.

Theorem 7.6. (Convergence rate of the hp-DGFEM)

Let $\Omega \subset \mathbb{R}^2$ and $\mathcal{T}$, $\mathcal{P}$ be as in Section 2 with (possibly irregular) patch meshes $\mathcal{T}_P$, $P \in \mathcal{P}$, consisting of shape-regular quadrilateral elements of degree $p_K \ge 1$. Select the stabilization parameters $\delta_K$ according to

\[
\delta_K = h_K / k_K \quad \text{for all } K \in \mathcal{T} . \tag{7.15}
\]

Then there holds

\[
|||u - u_{DG}|||_{DG}^2 \le C \sum_K \Big( \frac{h_K}{2} \Big)^{2 s_K + 1} \frac{\Phi(k_K, s_K)}{k_K}\, |\hat u|_{s_K + 1, \hat K}^2 , \tag{7.16}
\]

where $C > 0$ depends only on elemental shape regularity, and on the coefficients $a, b$, but is independent of $k_K, s_K, h_K$, and where $\Phi(p,s)$ is as in (5.10).
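Proof. (7.12) and Lemma 7.5 imply

\[
\begin{aligned}
|||u - u_{DG}|||_{DG} &\le |||\eta|||_{DG} + |||\xi|||_{DG} \\
&\overset{(7.13)}{\le} |||\eta|||_{DG} + \Big( \sum_K \|\delta^{1/2} \mathcal{L} \eta - \delta^{-1/2} \eta\|_K^2 \Big)^{1/2} + 2 \Big( \sum_K \|c\, \eta\|_K^2 \Big)^{1/2} \\
&\qquad + \sqrt{2}\, \Big( \sum_K \|\eta^+\|_{\partial_+ K \cap \Gamma_+}^2 \Big)^{1/2} + \Big( \sum_K \|\eta^-\|_{\partial_- K \setminus \Gamma_-}^2 \Big)^{1/2} \\
&\le \Big( \sum_K \|\delta^{1/2} \mathcal{L} \eta\|_K^2 \Big)^{1/2} + \Big( \sum_K \|c\, \eta\|_K^2 \Big)^{1/2} + \Big( \sum_K \|\eta^+\|_{\partial_- K \cap \Gamma_-}^2 \Big)^{1/2} \\
&\qquad + \frac{1}{\sqrt{2}} \Big( \sum_K \|\eta^+\|_{\partial_+ K \cap \Gamma_+}^2 \Big)^{1/2} + \frac{1}{\sqrt{2}} \Big( \sum_K \|\eta^+ - \eta^-\|_{\partial_- K \setminus \Gamma_-}^2 \Big)^{1/2} \\
&\qquad + \Big( \sum_K \|\delta^{1/2} \mathcal{L} \eta\|_K^2 \Big)^{1/2} + \Big( \sum_K \|\delta^{-1/2} \eta\|_K^2 \Big)^{1/2} + 2 \Big( \sum_K \|c\, \eta\|_K^2 \Big)^{1/2} \\
&\qquad + \sqrt{2}\, \Big( \sum_K \|\eta^+\|_{\partial_+ K \cap \Gamma_+}^2 \Big)^{1/2} + \Big( \sum_K \|\eta^-\|_{\partial_- K \setminus \Gamma_-}^2 \Big)^{1/2} \\
&\le C \Big\{ \sum_K \big( \delta_K \|\nabla \eta\|_K^2 + \delta_K \|\eta\|_K^2 + \|\eta\|_K^2 + \delta_K^{-1} \|\eta\|_K^2 \big) \\
&\qquad + \sum_K \big( \|\eta^+\|_{\partial_+ K \cap \Gamma_+}^2 + \|\eta^+\|_{\partial_- K \cap \Gamma_-}^2 + \|\eta^-\|_{\partial_- K \setminus \Gamma_-}^2 + \|\eta^+\|_{\partial_- K \setminus \Gamma_-}^2 \big) \Big\}^{1/2} \\
&= C (A + B)^{1/2} ,
\end{aligned}
\]

where $C$ depends on $(a, b)$, $A$ collects the element terms and $B$ the boundary terms.

We select $\eta = u - \Pi u$ with $\Pi$ as in Theorem 5.4. This gives the bound

\[
A \le C \sum_K \Big( \frac{h_K}{2} \Big)^{2 s_K} \Phi(k_K, s_K)\, \big( \delta_K + \delta_K^{-1} h_K^2 k_K^{-2} \big)\, |\hat u|_{s_K + 1, \hat K}^2 .
\]

To bound $B$, we must estimate $\|\eta\|_{0, \partial K}$. We use the inequality

\[
\|\eta\|_{L^2(\partial K)}^2 \le C \big( \|\nabla \eta\|_K\, \|\eta\|_K + h_K^{-1} \|\eta\|_K^2 \big) \quad \forall \eta \in H^1(K)
\]

and obtain the bound (with $p_K = k_K$)

\[
\begin{aligned}
B &\le C \sum_K \Big\{ \Big( \frac{h_K}{2} \Big)^{s_K} \Phi(p_K, s_K)^{1/2} \cdot \Big( \frac{h_K}{2} \Big)^{s_K + 1} \Phi(p_K, s_K)^{1/2}\, p_K^{-1} + h_K^{-1} \Big( \frac{h_K}{2} \Big)^{2 s_K + 2} \Phi(p_K, s_K)\, p_K^{-2} \Big\}\, |\hat u|_{s_K + 1, \hat K}^2 \\
&= C \sum_K \Big( \frac{h_K}{2} \Big)^{2 s_K + 1} p_K^{-1}\, \Phi(p_K, s_K)\, (1 + p_K^{-1})\, |\hat u|_{s_K + 1, \hat K}^2 .
\end{aligned}
\]

Selecting $\delta_K$ as in (7.15) concludes the proof.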


□ 


An  analogous  error  estimate  holds  true  for  the  hp-SDFEM. 


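Theorem 7.7. (Convergence rate of the hp-SDFEM)

Let $\Omega \subset \mathbb{R}^2$ and $\mathcal{T}$, $\mathcal{P}$ be as in Section 2 with a 1-irregular mesh consisting of shape-regular quadrilateral elements of degree $p_K \ge 1$. Select the stabilization parameter $\delta_K$ as in (7.15).

Then there holds the error estimate

\[
|||u - u_{SD}|||_{SD}^2 \le C \sum_K \Big( \frac{h_K}{2} \Big)^{2 s_K + 1} \frac{\Phi(k_K, s_K)}{k_K}\, |\hat u|_{s_K + 1, \hat K}^2 , \tag{7.17}
\]

where

\[
|||u|||_{SD}^2 := \|\sqrt{\delta}\, \mathcal{L} u\|^2 + \|c\, u\|^2 + \tfrac{1}{2} \|u\|_{\Gamma_+}^2 + \|u\|_{\Gamma_-}^2
\]

and

\[
0 \le s_K \le k_K \quad \forall K \in \mathcal{T}, \qquad \hat u = u \circ F_P \ \text{ if } K \in \mathcal{T}_P ,
\]

and $\Phi(k,s)$ is as in (5.10).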

Let  us  discuss  special  cases  of  the  above,  general  error  bounds. 


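Remark 7.8.

1) If $k_K = k$ is fixed, and $h_K = h \to 0$, (7.16) is optimal in $h$.

2) As $s$ is fixed, $k_K = k \to \infty$, Stirling's formula implies

\[
\Phi(k, s) \le C(s)\, k^{-2s}
\]

and

\[
|||u - u_{DG}|||_{DG}^2 \le C(s) \sum_K \Big( \frac{h_K}{2k} \Big)^{2s + 1} |\hat u|_{s + 1, \hat K}^2 .
\]

The bound (7.16) is optimal also in $k$, improving upon [12], [13].

3) If $u$ is patchwise analytic, we have the bounds

\[
\forall K \in \mathcal{T}\ \ \exists\, d_K \ge 1,\ C > 0\ \ \forall s \ge 0: \quad |\hat u|_{s, \hat K} \le C\, (d_K)^s\, s! \tag{7.18}
\]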
In this case, we get the exponential convergence estimate [36]

lll«  -  “dgIIIdg  ^  c  Yj  (^f)  ( kK )2  exp(-2 bKkK)  . 

k  z 

By  Theorem  7.7  an  analogous  bound  holds  also  for  the  hp-SDFEM  on  quadri¬ 
lateral,  possibly  1-irregular  meshes. 
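The Stirling-type decay $\Phi(k,s) \le C(s) k^{-2s}$ used in item 2) of Remark 7.8 can be illustrated numerically. The definition (5.10) of $\Phi$ is not reproduced in this section, so the sketch below assumes the Gamma-function quotient $\Gamma(k-s+1)/\Gamma(k+s+1)$ as a stand-in; this form and the function names are assumptions made only for illustration.

```python
import math

def phi_assumed(k, s):
    """Assumed stand-in for Phi: Gamma(k-s+1) / Gamma(k+s+1), 0 <= s <= k."""
    return math.exp(math.lgamma(k - s + 1) - math.lgamma(k + s + 1))

def scaled(k, s):
    """phi_assumed(k, s) * k**(2*s); stays bounded as k grows with s fixed."""
    return phi_assumed(k, s) * k ** (2 * s)

# For fixed s the scaled quotient approaches 1, i.e. phi_assumed ~ k^(-2s).
samples = [scaled(k, 2) for k in (2, 10, 100, 1000)]
```

Working with `math.lgamma` rather than factorials avoids overflow for large polynomial degrees.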

8  Convection-Diffusion 

Based  on  the  discretizations  of  the  diffusion  operator  in  Section  6  and  of  the 
advection  problem  in  Section  7,  it  is  now  easy  to  derive  discretizations  of  the 
convection-diffusion  problem 

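\[
\mathcal{L}_\varepsilon u \equiv -\varepsilon \Delta u + a(x) \cdot \nabla u + b(x) u = S \quad \text{in } \Omega , \tag{8.1}
\]

\[
u = 0 \quad \text{on } \partial\Omega . \tag{8.2}
\]

Here the viscosity $\varepsilon \in (0, 1]$, $S \in L^2(\Omega)$ and $a(x)$, $b(x)$ are assumed to belong to $C^1(\bar\Omega)$ and to satisfy (7.3).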

8.1  Standard  Galerkin  discretization 

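The variational formulation of (8.1), (8.2) reads: find $u \in H_0^1(\Omega)$ such that

\[
B_\varepsilon(u, v) := \varepsilon (\nabla u, \nabla v)_\Omega + (a \cdot \nabla u + b u, v)_\Omega = (S, v)_\Omega \quad \forall v \in H_0^1(\Omega) . \tag{8.3}
\]

The Galerkin finite element discretization of (8.3) reads: find $u_{FE} \in S_0^{k,1}(\Omega, \mathcal{T})$ such that

\[
B_\varepsilon(u_{FE}, v) = (S, v)_\Omega \quad \forall v \in S_0^{k,1}(\Omega, \mathcal{T}) . \tag{8.4}
\]

Condition (7.3) guarantees the solvability of (8.3), (8.4), since, for $u \in S_0^{k,1}(\Omega, \mathcal{T})$ it holds

\[
\begin{aligned}
B_\varepsilon(u, u) &= \varepsilon \|\nabla u\|^2 + \int_\Omega (a \cdot \nabla u + b u)\, u\, dx \\
&= \varepsilon \|\nabla u\|^2 + \int_\Omega (b - \tfrac{1}{2} \nabla \cdot a)\, |u|^2\, dx \\
&\ge \varepsilon \|\nabla u\|^2 + c_0 \|u\|^2 \\
&\ge \min(1, c_0)\, \|u\|_E^2 ,
\end{aligned} \tag{8.5}
\]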
where we used the formula

\[
\int_\Omega u\, a \cdot \nabla u\, dx = - \int_\Omega u^2\, \nabla \cdot a\, dx - \int_\Omega u\, a \cdot \nabla u\, dx + \int_{\partial\Omega} u^2\, a \cdot n\, ds .
\]

We see that (8.4) is stable in the $\|\circ\|_E$-norm, whence it follows that

\[
\|u - u_{FE}\|_E \le C\, \|u - v\|_E \quad \forall v \in S_0^{k,1}(\Omega, \mathcal{T}) . \tag{8.6}
\]

The Galerkin FEM (8.4) without stabilization converges therefore optimally, provided the FE-space $S_0^{k,1}(\Omega, \mathcal{T})$ resolves the fine scales of the solution (such as boundary layers, eddies, fronts etc.). If this is not the case, the Galerkin FEM (8.4) is prone to pollution, i.e. a local underresolution of fine solution scales triggers oscillations which spread throughout the domain $\Omega$.

To  prevent  this,  stabilized  schemes  must  be  used  (in  fact,  the  main  impetus 
for  the  development  of  stabilized  methods  has  come  from  the  inability  of  the 
FEM  to  resolve  all  small  scales  of  the  flow).  We  present  here  two  stabilization 
techniques,  the  hp-SDFEM  and  the  hp-DGFEM. 


8.2  Streamline-Diffusion  FEM 

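Formulation and main properties. The hp-SDFEM discretization of (8.3) reads: find $u_{FE} \in S_0^{k,1}(\Omega, \mathcal{T})$ such that

\[
B_{SD}(u_{FE}, v) = F_{SD}(v) \quad \forall v \in S_0^{k,1}(\Omega, \mathcal{T}) . \tag{8.7}
\]

Here the bilinear form and the right hand side include the so-called stabilization terms: for $u \in S_0^{k,1}(\Omega, \mathcal{T})$, we have with $\mathcal{L}_\varepsilon$ as in (8.1)

\[
B_{SD}(u, v) := B_\varepsilon(u, v) + \sum_{K \in \mathcal{T}} \delta_K \int_K (\mathcal{L}_\varepsilon u)(\mathcal{L}_0 v)\, dx , \tag{8.8}
\]

\[
F_{SD}(v) := (S, v)_\Omega + \sum_{K \in \mathcal{T}} \delta_K \int_K S\, \mathcal{L}_0 v\, dx . \tag{8.9}
\]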

Remark 8.1. At first sight, it would appear that the stabilization terms in (8.8), (8.9) require, for positive $\varepsilon$, $H^2(K)$-regularity of $u$. This is not so: all that is required for $B_{SD}$, $F_{SD}$ to make sense is that $\mathcal{L}_\varepsilon u \in L^2(K)$, and this is satisfied for the exact solution if $S$ in (8.1) belongs to $L^2(K)$ for all $K \in \mathcal{T}$.

Remark 8.2. The SDFEM formulation is fully consistent, i.e. for any value of $\varepsilon$ and $\delta$, the exact solution of (8.1) satisfies (8.7). Adding the stabilization terms on the right hand sides of (8.8), (8.9) therefore does not alter the problem to be discretized.

Remark 8.3. As in the pure convection case, the SDFEM contains free parameters $\delta_K$ at our disposal; for $\delta_K = 0$, (8.7) reduces to (8.4), while $\delta_K > 0$ will imply stabilization. $\delta_K$ needs to be selected in dependence on $k_K$ as well as on the element shape; this will be explained below. Proper choice of $\delta_K$ is crucial for good performance.

Remark 8.4. We see that for $\varepsilon = 0$ the SDFEM (8.7) becomes (7.2). All properties are shown below for the SDFEM.
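The effect of the stabilization term is easiest to see for piecewise linear elements in one dimension. The sketch below is an illustration under simplifying assumptions that are not those of the hp-analysis in the text: the model problem $-\varepsilon u'' + u' = 1$ on $(0,1)$ with homogeneous Dirichlet data, a uniform mesh, degree one, and the illustrative weight $\delta = h/2$. For piecewise linears $u''$ vanishes inside each element, so the streamline diffusion term simply adds $\delta$ to the viscosity, and for the constant source the extra load term vanishes for interior hat functions.

```python
import numpy as np

def solve_cd_p1(eps, n, delta=0.0):
    """P1 FEM for -eps*u'' + u' = 1 on (0,1), u(0) = u(1) = 0, uniform mesh.

    With delta > 0 the streamline diffusion term delta * int u' v' dx is
    added; for linear elements this amounts to the viscosity eps + delta.
    Returns the values at the n - 1 interior nodes.
    """
    h = 1.0 / n
    nu = eps + delta
    m = n - 1                                  # number of interior nodes
    A = np.zeros((m, m))
    for i in range(m):
        A[i, i] = 2.0 * nu / h                 # diffusion, diagonal
        if i > 0:
            A[i, i - 1] = -nu / h - 0.5        # diffusion + convection
        if i < m - 1:
            A[i, i + 1] = -nu / h + 0.5
    rhs = np.full(m, h)                        # load of the source S = 1
    return np.linalg.solve(A, rhs)

eps, n = 1e-3, 10
u_galerkin = solve_cd_p1(eps, n)               # unstabilized, oscillatory
u_sd = solve_cd_p1(eps, n, delta=0.5 / n)      # stabilized with delta = h/2
```

On this coarse mesh the boundary layer of width $O(\varepsilon)$ is not resolved: the unstabilized solution oscillates with overshoots far above the exact maximum (which is below 1), while the stabilized solution stays monotone and is accurate away from the layer.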


Fig. 8.1. $L^2$ and energy performance of “two-element mesh” for Galerkin FEM and SDFEM, $\varepsilon = 10^{-8}$


Stability. As we pointed out in Section 5.3, the solution of (8.1) exhibits, for small $\varepsilon > 0$, boundary layers and hp-FEM will not give exponential convergence uniform in $\varepsilon$ if unstructured, shape regular meshes are used (no layers are present for $\varepsilon = 0$, i.e. for the pure convection problem). We therefore address now the choice of the parameters $\delta_K$ in (8.8), (8.9) and the stability of the method. We assume that the mesh $\mathcal{T}$ in (8.7) is given in terms of patches $P \in \mathcal{P}$, regular patch maps $F_P : \hat K \to P$ and allow patch meshes $\mathcal{T}_P$ with anisotropic quadrilateral elements, of the type introduced in Section 5.3. Then there holds:
Fig. 8.2. $L^2$ and energy performance of p version on uniform mesh with $h = 0.5$ for Galerkin FEM and SDFEM, $\varepsilon = 10^{-4}$


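Theorem 8.5. Let the mesh $\mathcal{T}$ consist of shape regular triangles of diameter $h_K$ or of possibly anisotropic quadrilaterals with sidelengths $h_{K,\max}$ and $h_{K,\min}$, respectively (no bound on the aspect ratio $h_{K,\max}/h_{K,\min}$ is assumed). Then there exists $\delta_0 > 0$ independent of $h_K$, $k$ and of the aspect ratio, such that for all $0 < \delta \le \delta_0$ the choice

\[
\delta_K = \frac{\delta}{k_K^2}\, \frac{h_{K,\max}\, h_{K,\min}}{\sqrt{h_{K,\max}^2 + h_{K,\min}^2}} \tag{8.10}
\]

will render the hp-SDFEM (8.7) stable independent of the aspect ratio, i.e. it holds

\[
\tfrac{1}{2}\, \|u\|_{SD}^2 \le B_{SD}(u, u) \quad \forall u \in S_0^{k,1}(\Omega, \mathcal{T}) , \tag{8.11}
\]

where the norm $\|\circ\|_{SD}$ is defined by

\[
\|u\|_{SD}^2 := \varepsilon\, \|\nabla u\|_{0,\Omega}^2 + \|u\|_{0,\Omega}^2 + \sum_{K \in \mathcal{T}} \delta_K\, \|\mathcal{L}_0 u\|_{0,K}^2 .
\]

For the proof, we refer to [28].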
Fig. 8.3. $L^2$ and energy performance of p version on uniform mesh with $h = 0.5$ + small element size $\varepsilon$, $\varepsilon = 10^{-4}$


Note that for shape-regular elements $h_{K,\max} = h_{K,\min} = h_K$ and (8.10) becomes simply

\[
\delta_K = \delta\, h_K / k_K^2 , \tag{8.12}
\]

which should be compared with the choice (7.15): we see that the appearance of the viscous terms changes the weight $\delta_K$ from $h_K / k_K$ to $h_K / k_K^2$, at least as far as the stability analysis is concerned.

Remark 8.6. The previous Theorem applies in particular also to the (k, u)-geometric boundary layer meshes shown in Figures 8 and 9.

We  shall  see  in  Section  10  below  how  stabilized  formulations  like  (8.8), 
(8.9)  can  also  be  used  in  the  computation  of  incompressible  fluid  flow. 


Computational Experiments. In this section, we illustrate the performance of the hp-SDFEM (8.7) with numerical examples for 1-d convection-dominated problems. All findings which we report below are mathematically explained in detail in [47]. Our aims in these numerical experiments are

1. to illustrate the theoretical results obtained above, in particular the ability of the hp-FEM to resolve very narrow fronts and layers, leading to the asymptotic exponential convergence with few degrees of freedom;

2. to compare hp-SDFEM and hp-Galerkin FEM in the preasymptotic phase, i.e., if the small scales of the solution are not resolved. In particular, we will see that the appropriate choice of mesh sequences leads to robust exponential convergence on compact subsets for the hp-SDFEM.

Fig. 8.4. $L^2$ and $H^1$ performance on the first element of p version on uniform mesh with $h = 0.5$, $\varepsilon = 10^{-4}$

We  consider  two  types  of  problems,  a  standard  advection-dominated  problem 
and  a  turning  point  problem  which  satisfies  the  crucial  assumption  (7.3). 

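The boundary layer case. Let us first consider the problem

\[
-\varepsilon u_\varepsilon'' + a\, u_\varepsilon' = e^{\omega x}, \qquad u_\varepsilon(\pm 1) = 0, \qquad \omega = 1, \quad a = 1 . \tag{8.13}
\]

The exact solution has a boundary layer at the outflow boundary $x = 1$ and is given by

\[
u_\varepsilon(x) = \frac{e^{\omega x}}{\omega (a - \omega \varepsilon)} + \alpha_\varepsilon + c\, e^{-a (1 - x)/\varepsilon} , \tag{8.14}
\]

where the constants follow from the two boundary conditions:

\[
\alpha_\varepsilon = \frac{e^{-\omega} \big( 1 - e^{2\omega} e^{-2a/\varepsilon} \big)}{\omega (\omega \varepsilon - a) \big( 1 - e^{-2a/\varepsilon} \big)} , \qquad
c = \frac{e^{-\omega} \big( e^{2\omega} - 1 \big)}{\omega (\omega \varepsilon - a) \big( 1 - e^{-2a/\varepsilon} \big)} = O(1) . \tag{8.15}
\]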
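A closed-form solution of this kind is easy to get wrong by a sign or a factor, so a numerical check is worthwhile. The sketch below evaluates the solution of $-\varepsilon u'' + a u' = e^{\omega x}$, $u(\pm 1) = 0$, with the two constants obtained by imposing the boundary conditions on the ansatz "particular solution + constant + layer term", and then checks the boundary values and a finite-difference residual. The function names and the finite-difference step are illustrative assumptions.

```python
import math

def u_exact(x, eps, a=1.0, omega=1.0):
    """Solution of -eps*u'' + a*u' = exp(omega*x) with u(-1) = u(1) = 0."""
    decay = math.exp(-2.0 * a / eps)           # underflows to 0 for tiny eps
    denom = omega * (omega * eps - a) * (1.0 - decay)
    alpha = math.exp(-omega) * (1.0 - math.exp(2.0 * omega) * decay) / denom
    c = math.exp(-omega) * (math.exp(2.0 * omega) - 1.0) / denom
    particular = math.exp(omega * x) / (omega * (a - omega * eps))
    return particular + alpha + c * math.exp(-a * (1.0 - x) / eps)

def residual(x, eps, a=1.0, omega=1.0, step=1e-4):
    """Central finite-difference residual of the ODE at the point x."""
    um, u0, up = (u_exact(x + s, eps, a, omega) for s in (-step, 0.0, step))
    d1 = (up - um) / (2.0 * step)
    d2 = (up - 2.0 * u0 + um) / step ** 2
    return -eps * d2 + a * d1 - math.exp(omega * x)
```

For very small $\varepsilon$ the layer term is negligible away from $x = 1$ and the solution is close to the reduced solution, which makes the formula a convenient reference for the experiments with $\varepsilon = 10^{-4}$ and $\varepsilon = 10^{-8}$.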
Fig. 8.5. $L^2$ and $H^1$ performance on the first element of p version on uniform mesh with $h = 0.5$ + small element size $\varepsilon$, $\varepsilon = 10^{-4}$

Note that both $\|u_\varepsilon\|_{L^2(\Omega)}$ and $|||u_\varepsilon|||$ are $O(1)$ independently of $\varepsilon$.

Global SDFEM performance. We present numerical results for the SDFEM. In order to illustrate the robustness of the SDFEM with respect to the weights $(\delta_i)_{i=1}^N$ noted in Section 8.2 we choose the weights $(\delta_i)_{i=1}^N$ of the SDFEM as

5_\\hi  if  ek2/hi  <  \ 

[  0  otherwise  . 

We  point  out  that  numerical  results  are  practically  identical  if  the  choice  is 
made. 

In our first series of numerical experiments, we resolve the boundary layer with the two-element mesh of (5.21) with n = 1. Fig. 8.1 compares the behavior of the Galerkin and the SDFEM in the $L^2$ norm and the energy norm $|||\cdot|||$ (which is $\sqrt{\varepsilon}\, |\cdot|_{H^1(\Omega)}$) for $\varepsilon = 10^{-8}$ where the order $k$ ranges from 1 to 27. The theory of [47] yields robust exponential convergence in the energy norm for the SDFEM as well as the Galerkin FEM on this two element mesh. This exponential convergence is visible in the bottom figure of Fig. 8.1.
Fig. 8.6. $L^2$ and energy performance for “hp”-mesh; SDFEM; $\sigma = 0.5$, $\varepsilon = 10^{-8}$


Furthermore, for the SDFEM, we have robust exponential convergence in $L^\infty$ and thus in $L^2$ (cf. the top figure of Fig. 8.1); we also observe robust exponential convergence in $L^2$ for the standard Galerkin FEM, Fig. 8.1. We note that the qualitative behavior of the schemes is comparable although the error of the hp-SDFEM is slightly smaller than that of the Galerkin FEM for this problem.

We  conclude  that  the  two-element  mesh  scheme  is  able  to  resolve  the 
boundary  layer  at  the  outflow  boundary  and  that  no  stabilization  is  required 
in  this  case. 

Our next experiment is geared towards getting insight into the behavior of the Galerkin method and the SDFEM if the boundary layer has not been resolved. To that end, we consider the performance of the p version on a uniform mesh with $h = 0.5$ (i.e., 4 elements). Here, the order $k$ ranges from 1 to 27 and $\varepsilon = 10^{-4}$. Fig. 8.2 shows the behavior in the $L^2$ and the energy norm $|||\cdot|||$. The error in the hp-SDFEM is considerably smaller than that of the Galerkin method, but the rate of convergence of the SDFEM is very poor as well; in the energy norm, no convergence can be observed!

Finally, Fig. 8.3 shows the performance of a uniform mesh ($h = 0.5$) augmented by one small element of size $\varepsilon$ in the outflow boundary layer (i.e., the mesh given by the nodes $\{-1, -0.5, 0, 0.5, 1 - \varepsilon, 1\}$). As is to be expected, inserting one small element of size $\varepsilon$ greatly alleviates the problems of the standard Galerkin method (cf. [47] for a detailed analysis). Comparing Fig. 8.2 with Fig. 8.3, the error of the Galerkin FEM is reduced by two orders of magnitude. Nevertheless, both the Galerkin method and the SDFEM yield poor rates of convergence, as the p version on a mesh with one small element of size $\varepsilon$ near the boundary cannot resolve the boundary layer properly. Hence, comparing the results with those in Fig. 8.1, we see that the proper element length $\varepsilon k$ at the boundary is essential for the resolution of the boundary layer (compare Theorem 5.15) as well as for robust exponential convergence.

Fig. 8.7. $L^2$ and $H^1$ performance on first element $(-1, 0)$ for “hp”-mesh; SDFEM; $\sigma = 0.5$, $\varepsilon = 10^{-8}$


Local p-SDFEM performance: pollution. We have just seen that the pure p version Galerkin FEM and SDFEM have poor convergence properties if the error is measured in a global norm such as the $L^2$ or the $|||\cdot|||$ norm. The performance was not substantially improved by inserting one small element of size $\varepsilon$ in the layer. The local behavior of the pure p-version SDFEM is investigated in Figs. 8.4, 8.5 by plotting the relative $L^2$ and $H^1$ errors in the first element $I_1 = (-1, 0)$ for a uniform mesh with $h = 0.5$ and a uniform mesh with $h = 0.5$ that is augmented by one small element of size $\varepsilon$ in the layer. Although the SDFEM, which suppresses spurious oscillations, is much more accurate (1% in both $L^2$ and $H^1$ on $(-1, -0.5)$) than the Galerkin FEM, we see that increasing the order $k$ does not reduce the error. We conclude that the pure p-version of both the Galerkin FEM and the SDFEM are prone to pollution, i.e., the error introduced by not resolving the boundary layer strongly affects the accuracy achievable in the whole computational domain.

Fig. 8.8. $L^2$ and energy error for turning pt. problem, $a = 1$, $\varepsilon = 10^{-8}$, 3 elem.

Local SDFEM performance on special mesh sequences. Our next numerical example shows that the hp-SDFEM leads to robust exponential convergence on compact subsets not containing the layers if an increase of the polynomial degree is combined with a mesh refinement towards the layer. We therefore consider the following scheme: For a grading factor $\sigma \in (0, 1)$ let

\[
k_0 \in \mathbb{N} \ \text{be the smallest integer s.t.} \ \sigma^{k_0} < k_0\, \varepsilon
\]

and let for each polynomial degree $k$ a geometrically refined mesh with p layers be given by the points

\[
\{ -1,\ 1,\ 1 - \sigma^i \mid i = 0, \ldots, \min(k, k_0) \} . \tag{8.16}
\]

On such meshes, we will consider as trial spaces the space $S_0^{k,1}(\mathcal{T})$ (cf. Fig. 8.10). We note that such mesh sequences would typically be generated by adaptive schemes that locate and try to resolve the layers. It can be shown using ideas of [40,75] (cf. [47] for the details) that the hp-SDFEM converges robustly and exponentially on compact subsets of $\Omega$ for such mesh sequences:

Theorem 8.7. Let $a = 1$, $b = 0$, $\sigma \in (0, 1)$, $\xi \in (-1, 1)$ be fixed. For $k \in \mathbb{N}$ consider the meshes $\mathcal{T}$ defined by the nodes (8.16). Assume that the weights $(\delta_i)_{i=1}^N$ are of the form (8.12). Then there are constants $C, b > 0$ independent of $\varepsilon$, $k$ such that

\[
\|u_\varepsilon - u_{SD}\|_{H^1(-1, \xi)} \le C e^{-bk}, \qquad k = 1, 2, \ldots
\]

Fig. 8.9. $L^2$ and energy error for turning pt. problem, $a = -1$, $\varepsilon = 10^{-8}$, 4 elem.

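The refinement factor $\sigma$ is chosen in the following experiments as $\sigma = 1/2$
and the weights $\delta_i$ are given in both cases by

$$\delta_i = \begin{cases} \dfrac{h_i}{4k} & \text{if } \varepsilon < \dfrac{h_i}{4k}, \\[1ex] 0 & \text{otherwise.} \end{cases} \qquad (8.17)$$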

Again, we point out that choosing the weights $\delta_i$ as in (8.10), (8.12)
leads to similar numerical results. For $\varepsilon = 10^{-8}$ and $k$ going from 1 to 22,
Figs. 8.6-8.7 show the performance of the SDFEM in comparison with the
Galerkin FEM. Fig. 8.6 depicts their behavior in global norms ($L^2$ and energy
norm) whereas Fig. 8.7 shows the relative error (measured in the $L^2$ and $H^1$
norm) in the first element $I_1 = (-1, 0)$. Fig. 8.6 illustrates once more that
both Galerkin FEM and hp-SDFEM do not lead to convergence in the energy
norm until the layer is resolved, that is, $\sigma^k \approx k\varepsilon$ (for $\sigma = 0.5$ and $\varepsilon = 10^{-8}$
this happens for $k \approx 22$). The behavior of the Galerkin FEM is, however,
completely different from that of the hp-SDFEM if the error on the first
element $I_1 = (-1, 0)$ is of interest (cf. Fig. 8.7). The Galerkin FEM is highly
prone to pollution: the local error in $I_1$ cannot be controlled until $k$ is so
large that the smallest element in the layer has width $\sigma^k \approx k\varepsilon$. In contrast to
this, the SDFEM is pollution-free, as robust exponential convergence on the
compact subset $(-1, 0)$ can be achieved according to Theorem 8.7 and in fact
is visible in Fig. 8.7.


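Turning point problems. Let us now consider a problem with a turning
point at $x = 0$. We consider

$$-\varepsilon u_\varepsilon'' + a x u_\varepsilon' + u_\varepsilon = 1 \quad \text{on } (-1, 1), \qquad a = \pm 1, \qquad (8.18)$$

$$u_\varepsilon(\pm 1) = 0. \qquad (8.19)$$

In the case $a = 1$, the exact solution has boundary layers at both endpoints
$\pm 1$; for $a = -1$, the exact solution exhibits an internal layer at the turning
point $x = 0$. The exact solutions are given by

$$u_\varepsilon(x) = 1 - \exp\{(x^2 - 1)/(2\varepsilon)\} \quad \text{for } a = 1, \qquad (8.20)$$

$$u_\varepsilon(x) = 1 - c\,x\,\operatorname{erf}\bigl(x/\sqrt{2\varepsilon}\bigr) - \sqrt{2/\pi}\,c\,\sqrt{\varepsilon}\,\exp\{-x^2/(2\varepsilon)\} \quad \text{for } a = -1,$$

$$c := \Bigl(\operatorname{erf}\bigl(1/\sqrt{2\varepsilon}\bigr) + \sqrt{2\varepsilon/\pi}\,\exp\bigl(-1/(2\varepsilon)\bigr)\Bigr)^{-1} \approx 1 \quad \text{for small } \varepsilon,$$

$$\operatorname{erf}(x) := \frac{2}{\sqrt{\pi}} \int_0^x \exp(-t^2)\,dt, \qquad \operatorname{erf}(x) \to 1 \ \text{for } x \to \infty. \qquad (8.21)$$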
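That (8.20) and (8.21) solve (8.18), (8.19) can be checked numerically. The sketch below uses hand-derived first and second derivatives of the two formulas (these derivative expressions are part of the illustration, not of the text) and evaluates the residual of (8.18); all function names are hypothetical.

```python
import math


def u_boundary_layers(x, eps):
    """(u, u', u'') for (8.20), the case a = +1."""
    w = math.exp((x * x - 1.0) / (2.0 * eps))
    return 1.0 - w, -(x / eps) * w, -(1.0 / eps + x * x / eps ** 2) * w


def u_internal_layer(x, eps):
    """(u, u', u'') for (8.21), the case a = -1."""
    s = math.sqrt(2.0 * eps)
    k = math.sqrt(2.0 * eps / math.pi)
    c = 1.0 / (math.erf(1.0 / s) + k * math.exp(-1.0 / (2.0 * eps)))
    gauss = math.exp(-x * x / (2.0 * eps))
    u = 1.0 - c * x * math.erf(x / s) - c * k * gauss
    du = -c * math.erf(x / s)
    d2u = -c * math.sqrt(2.0 / (math.pi * eps)) * gauss
    return u, du, d2u


def residual(solution, a, x, eps):
    """Residual -eps*u'' + a*x*u' + u - 1 of (8.18) at the point x."""
    u, du, d2u = solution(x, eps)
    return -eps * d2u + a * x * du + u - 1.0
```

A moderate value such as $\varepsilon = 0.05$ keeps all exponentials well within floating point range; for $\varepsilon = 10^{-8}$ the layer terms underflow to zero away from the layers.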


Equation (8.18) satisfies the crucial assumption (7.3), and the fact that
the convection coefficient is a polynomial allows us to modify the arguments so as to
accommodate the case of (8.18) as well. For the SDFEM we use the weights
(8.10), i.e.

$$\delta_i = \frac{1}{4}\,\frac{h_i}{k}.$$


The solution given by (8.20) (i.e., the case $a = 1$) has two boundary layers at
both endpoints with length scale $O(\varepsilon)$. The structure of the boundary layers
is essentially of the form analyzed in Section 3.2 so that the approximation
results with the “two-element” meshes introduced apply. In fact, a “three-element”
mesh consisting of two small elements of size $k\varepsilon$ at the boundary
points and one large element in the middle (that is, the mesh is given by the
points $\{-1,\ -1 + k\varepsilon,\ 1 - k\varepsilon,\ 1\}$) is well-suited to resolve the layers in both the
Galerkin FEM and the SDFEM (cf. Fig. 8.8 where $\varepsilon = 10^{-8}$).

In the case $a = -1$, the solution is given by (8.21) and has an internal
layer of width $O(\sqrt{\varepsilon})$. Again, the “two-element” mesh in Theorem 5.14 can be
applied successfully for the approximation of the internal layer if at least one
element of size $O(k\sqrt{\varepsilon})$ is introduced at the turning point $x = 0$. Fig. 8.9
shows the performance of the Galerkin FEM and the SDFEM for a “four-element”
mesh based on the points $\{-1,\ -k\sqrt{\varepsilon},\ 0,\ k\sqrt{\varepsilon},\ 1\}$ and $\varepsilon = 10^{-8}$.
Although the error graphs do not behave monotonically, the
“four-element” hp-SDFEM exhibits overall exponential convergence rates.
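The two layer-adapted meshes can be generated directly from $k$ and $\varepsilon$. A minimal sketch follows; the function names and the validity check are assumptions of this illustration only.

```python
import math


def _check_increasing(nodes):
    """Reject degenerate meshes, e.g. when the layer elements overlap."""
    if any(b <= a for a, b in zip(nodes, nodes[1:])):
        raise ValueError("mesh nodes are not strictly increasing")
    return nodes


def three_element_mesh(k, eps):
    """Nodes {-1, -1 + k*eps, 1 - k*eps, 1}: boundary layers of width O(eps)."""
    return _check_increasing([-1.0, -1.0 + k * eps, 1.0 - k * eps, 1.0])


def four_element_mesh(k, eps):
    """Nodes {-1, -k*sqrt(eps), 0, k*sqrt(eps), 1}: internal layer at x = 0."""
    w = k * math.sqrt(eps)
    return _check_increasing([-1.0, -w, 0.0, w, 1.0])
```

With $k = 10$ and $\varepsilon = 10^{-8}$, the boundary-layer elements have width $10^{-7}$, whereas the elements next to the turning point have width $10^{-3}$, reflecting the different layer scales $O(\varepsilon)$ and $O(\sqrt{\varepsilon})$.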


Conclusions on hp-SDFEM for convection-diffusion. From our numerical
experiments we conclude that some mesh refinement in the layer is
indispensable for proper performance (in a global norm) of both the hp Galerkin
FEM and the hp-SDFEM; in this case, both methods perform comparably well.
If, however, the length scales of the solution are not completely resolved, the
hp-SDFEM is considerably more robust than the Galerkin FEM in the sense
that it effectively suppresses spurious oscillations in the pre-asymptotic range
of convergence, and that its asymptotic convergence rate is very close to that
of the best approximation.

A successful strategy in more complicated settings will therefore combine
mesh adaptation at low $p$ with SDFEM stabilization in the pre-asymptotic
range in order to locate the layers/fronts. Once the layers/fronts are located,
our mesh design principles based on the “two-element” mesh can be successfully
applied to resolve the layers.

In this pre-asymptotic range, when the layers/fronts are still to be located
by some adaptive scheme, the hp-SDFEM already leads to robust exponential
convergence on compact subsets “upstream”. The pure Galerkin FEM, on the
other hand, does not produce reliable results anywhere in the computational
domain until the layer is resolved.

We emphasize that we investigated here only one-dimensional linear problems
where very precise regularity properties of the solution are available. The
stability analysis of the hp-SDFEM, however, did not exploit these properties,
so that similar findings will likely hold also in two- and three-dimensional
situations. The main conclusion which can be drawn from the numerical
experiments is that localized small scale phenomena of viscous flow can be
resolved with moderate computational effort by hp-FE discretizations and that the
hp-SDFEM can perform very satisfactorily in an adaptive environment.

Fig. 8.10. Geometric sequence of meshes generated by successively halving the
rightmost element


9  Elasticity 

We have now concluded the presentation of the basic hp-FE discretization
techniques for the scalar model problem (2.1) and turn to the system of
Navier-Stokes equations (1.1)-(1.3). Equation (1.1) is hyperbolic and of
the convection type considered in Section 7, whereas (1.2) and (1.3) are of
nonlinear convection-diffusion type. In particular, (1.3) is a scalar, nonlinear
convection-diffusion problem of the type treated in Section 8. Here, we focus
on the momentum equation (1.2), which we approach from the “elliptic” right
hand side.


9.1  Basic  equations 

In the absence of advection and transient effects, (1.2) reads

$$-\frac{\partial}{\partial x_j}\,\tau_{ij} = S_i \quad \text{in } \Omega \qquad (9.1)$$

(here and in what follows, indices run in the set $\{1, \dots, d\}$ and Einstein’s
summation convention is used).

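The stresses $\tau_{ij}$ must be related to the velocity field $u_i$ by a constitutive
law. In a Newtonian fluid, the stresses depend on the symmetric velocity
gradient

$$D_{ij}(\mathbf{u}) = \frac{1}{2}\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i}\right), \qquad 1 \le i, j \le d. \qquad (9.2)$$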

If the medium is homogeneous and isotropic, $\tau$ and $D$ are related by Hooke’s
law, i.e.

$$\tau_{ij}(\mathbf{u}) = \gamma\,(\operatorname{div} \mathbf{u})\,\delta_{ij} + 2\mu\,D_{ij}(\mathbf{u}) \qquad (9.3)$$

where $\gamma$ and $\mu$ are the so-called Lamé coefficients. Experimentally, we always
have

$$\mu \ge 0, \qquad \gamma + 2\mu/d \ge 0 \quad \text{in } \mathbb{R}^d. \qquad (9.4)$$

For most Newtonian fluids and gases, experimental evidence indicates that
$\gamma + 2\mu/3$ is very small; it is therefore set to zero for many common fluids. We
shall see in Remark 9.2 below that this may be problematic.

If $\gamma = \mu = 0$, we see from (9.3) that the equation (1.2) becomes inviscid,
since $\tau = 0$. In the compressible Navier-Stokes case we will assume here

$$\mu > 0, \qquad \gamma + 2\mu/d > 0. \qquad (9.5)$$

Then the right hand side of (1.2) will be dissipative as well, see Proposition
9.1 below.
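The constitutive law (9.2), (9.3) can be evaluated pointwise from the velocity gradient. The sketch below is a plain-Python illustration with nested lists; the function names are hypothetical and `grad_u[i][j]` is assumed to hold $\partial u_i/\partial x_j$.

```python
def symmetric_gradient(grad_u):
    """D_ij = (du_i/dx_j + du_j/dx_i) / 2 for a d x d velocity gradient, cf. (9.2)."""
    d = len(grad_u)
    return [[0.5 * (grad_u[i][j] + grad_u[j][i]) for j in range(d)]
            for i in range(d)]


def stress(grad_u, gamma, mu):
    """tau_ij = gamma * (div u) * delta_ij + 2 * mu * D_ij, cf. (9.3)."""
    d = len(grad_u)
    D = symmetric_gradient(grad_u)
    div_u = sum(grad_u[i][i] for i in range(d))
    return [[(gamma * div_u if i == j else 0.0) + 2.0 * mu * D[i][j]
             for j in range(d)] for i in range(d)]
```

For a pure dilation in two dimensions, `grad_u = [[1, 0], [0, 1]]`, the choice $\gamma = -\mu$ (that is, $\gamma + 2\mu/d = 0$) gives a vanishing stress, which illustrates why the borderline case excluded by (9.5) carries no dissipation for such motions.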

9.2  Variational  formulation 

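The variational formulation of (9.1) is obtained by integration by parts with
the Green formula

$$-(\mathbf{v}, \operatorname{div} \tau(\mathbf{u}))_\Omega = 2(\mu D(\mathbf{u}), D(\mathbf{v}))_\Omega + (\gamma \operatorname{div} \mathbf{u}, \operatorname{div} \mathbf{v})_\Omega - (\mathbf{v}, \tau(\mathbf{u})\mathbf{n})_{\partial\Omega}. \qquad (9.6)$$

We impose Dirichlet (“no-slip”) boundary conditions

$$\mathbf{u} = 0 \quad \text{on } \Gamma_D \qquad (9.7)$$

and Neumann (“flux”) boundary conditions

$$\tau(\mathbf{u})\mathbf{n} = \mathbf{g} \quad \text{on } \Gamma_N. \qquad (9.8)$$

Using (9.1) and (9.8) in (9.6), we get the standard variational formulation:
find $\mathbf{u} \in H^1_D(\Omega)^d$ such that

$$E(\mathbf{u}, \mathbf{v}) = F(\mathbf{v}) \qquad \forall \mathbf{v} \in H^1_D(\Omega)^d \qquad (9.9)$$

with the elasticity bilinear form given by

$$E(\mathbf{u}, \mathbf{v}) := 2(\mu D(\mathbf{u}), D(\mathbf{v}))_\Omega + (\gamma \operatorname{div} \mathbf{u}, \operatorname{div} \mathbf{v})_\Omega$$

and

$$F(\mathbf{v}) := (\mathbf{S}, \mathbf{v})_\Omega + (\mathbf{v}, \mathbf{g})_{\Gamma_N}.$$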

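The term $\operatorname{div} \tau(\mathbf{u})$ on the right hand side of (1.2) is dissipative, since there
holds

Proposition 9.1. Let $\Omega \subset \mathbb{R}^d$ be a bounded Lipschitz domain and assume
that $\Gamma_D \subset \partial\Omega$ satisfies $\int_{\Gamma_D} ds > 0$. Assume further that $\gamma$, $\mu$ are constant in
$\Omega$. The right hand side of (1.2) is dissipative if and only if (9.5) holds.

Proof. If $\int_{\Gamma_D} ds > 0$, there exists $C(\Omega, \Gamma_D) > 0$ such that Korn’s inequality
holds:

$$\forall \mathbf{u} \in H^1(\Omega, \Gamma_D)^d : \quad \|D(\mathbf{u})\|_{L^2(\Omega)} \ge C\,\|\mathbf{u}\|_{H^1(\Omega)}. \qquad (9.10)$$

Writing $a = \frac{1}{d} \operatorname{trace}(D(\mathbf{u}))$ and defining the deviatoric part $D_0(\mathbf{u}) := D(\mathbf{u}) - a\,\mathbf{1}$,
we get

$$E(\mathbf{u}, \mathbf{u}) = \gamma \|\operatorname{div} \mathbf{u}\|_\Omega^2 + 2\mu (D(\mathbf{u}), D(\mathbf{u}))_\Omega = 2\mu \|D_0(\mathbf{u})\|_\Omega^2 + (2\mu + \gamma d)\, d\, \|a\|_\Omega^2$$

where we used that

$$\|D(\mathbf{u})\|_{L^2(\Omega)}^2 = (D(\mathbf{u}), D(\mathbf{u}))_\Omega = \|D_0(\mathbf{u})\|_\Omega^2 + d\,\|a\|_\Omega^2.$$

Hence we get

$$E(\mathbf{u}, \mathbf{u}) \ge \min(2\mu,\ 2\mu + \gamma d)\, \|D(\mathbf{u})\|_\Omega^2.$$

Korn’s inequality and (9.5) imply the assertion. □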
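The algebra in the proof holds pointwise for any symmetric matrix $D$, so it can be checked on a single matrix. The following sketch (function names are hypothetical) evaluates the integrand of $E(\mathbf{u}, \mathbf{u})$ both directly and through the deviatoric splitting.

```python
def frobenius_sq(A):
    """Squared Frobenius norm of a square matrix given as nested lists."""
    return sum(a * a for row in A for a in row)


def energy_density(D, gamma, mu):
    """Integrand of E(u, u): gamma * (tr D)^2 + 2 * mu * |D|^2."""
    trace = sum(D[i][i] for i in range(len(D)))
    return gamma * trace ** 2 + 2.0 * mu * frobenius_sq(D)


def split_energy_density(D, gamma, mu):
    """Same integrand via D = D0 + a*1 with a = tr(D) / d."""
    d = len(D)
    a = sum(D[i][i] for i in range(d)) / d
    D0 = [[D[i][j] - (a if i == j else 0.0) for j in range(d)]
          for i in range(d)]
    return 2.0 * mu * frobenius_sq(D0) + (2.0 * mu + gamma * d) * d * a * a


def lower_bound(D, gamma, mu):
    """min(2*mu, 2*mu + gamma*d) * |D|^2, the bound used in the proof."""
    d = len(D)
    return min(2.0 * mu, 2.0 * mu + gamma * d) * frobenius_sq(D)
```

Both evaluations agree up to rounding, and the lower bound is positive exactly when $\mu > 0$ and $2\mu + \gamma d > 0$, which is (9.5).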

Remark 9.2. In the case of a monatomic gas, $\mu > 0$ and

$$\gamma + 2\mu/d = 0$$

(Stokes’ relation). The dissipativity of $\operatorname{div} \tau(\mathbf{u})$ in (1.2) is then not clear.

The FE discretizations of (9.9) are analogous to those of Section 6 and
we present them here.

9.3  Standard  continuous  discretization 

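It reads: find $\mathbf{u}_{FE} \in S^{\mathbf{p},1}(\Omega, \mathcal{T}, F_{\mathcal{P}})^d$ such that

$$E(\mathbf{u}_{FE}, \mathbf{v}) = F(\mathbf{v}) \qquad \forall \mathbf{v} \in S^{\mathbf{p},1}(\Omega, \mathcal{T}, F_{\mathcal{P}})^d. \qquad (9.11)$$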

9.4  Dual  mixed  formulation 

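Again, to accommodate discontinuous velocities $\mathbf{u}_{FE}$, a mixed formulation
is useful. Now the flux is simply the stress tensor $\tau(\mathbf{u})$. The mixed form of
(9.1), (9.7), (9.8) reads

$$\begin{aligned} -\operatorname{div} \tau &= \mathbf{S} && \text{in } \Omega, \\ \tau &= C D(\mathbf{u}) && \text{in } \Omega, \\ \mathbf{u} &= 0 && \text{on } \Gamma_D, \\ \tau(\mathbf{u})\mathbf{n} &= \mathbf{g} && \text{on } \Gamma_N, \end{aligned} \qquad (9.12)$$

and in weak form: find $\tau \in H(\operatorname{div}, \Omega)$, $\mathbf{u} \in L^2(\Omega)^d$ such that, assuming (9.5)
holds,

$$\begin{aligned} (C^{-1}\tau, \sigma)_\Omega + (\mathbf{u}, \operatorname{div} \sigma)_\Omega &= 0 && \forall \sigma \in H(\operatorname{div}, \Omega), \\ (\mathbf{v}, \operatorname{div} \tau)_\Omega &= (\mathbf{S}, \mathbf{v})_\Omega + (\mathbf{g}, \mathbf{v})_{\Gamma_N} && \forall \mathbf{v} \in L^2(\Omega)^d. \end{aligned} \qquad (9.13)$$

Here $H(\operatorname{div}, \Omega) := \{\tau \in L^2(\Omega)^{d \times d} : \operatorname{div} \tau \in L^2(\Omega)^d\}$ and $C^{-1}$, the inverse
of the elasticity tensor $C$, is the compliance tensor.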

The construction of finite elements which are $H(\operatorname{div}, \Omega)$ conforming and
stable for (9.13) is delicate. Some 2-d examples can be found in [16], Chapter
VII.2. Note that in $\mathbb{R}^d$ in the mixed formulation (9.13) $d(d+1)/2$ additional
fields have to be discretized; for elasticity problems with discontinuous $\mathbf{u}_{FE}$,
the incentive to consider mortar resp. DG-FEM is therefore even higher than
for scalar advection-diffusion problems. Moreover, the discretization techniques
for the scalar case carry over to a large extent. We therefore do not
recommend discrete versions of (9.13) in fluid flow simulations, and turn to
the mortar and DG methods.


9.5  Mortar  Discretization 

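Basic Discretization. We use the notation of Section 6.3 and, proceeding
analogously, we arrive at: find $\mathbf{u} \in H^1_D(\Omega, \mathcal{T})^d$, $\boldsymbol{\mu} \in M^d$ such that $\forall \mathbf{v} \in H^1_D(\Omega, \mathcal{T})^d$, $\forall \boldsymbol{\lambda} \in M^d$

$$\begin{aligned} E_{\mathcal{T}}(\mathbf{u}, \mathbf{v}) + \Phi_{\mathcal{T}}(\mathbf{v}, \boldsymbol{\mu}) &= (\mathbf{S}, \mathbf{v})_\Omega + (\mathbf{g}, \mathbf{v})_{\Gamma_N}, \\ \Phi_{\mathcal{T}}(\mathbf{u}, \boldsymbol{\lambda}) &= 0. \end{aligned} \qquad (9.14)$$

Here the broken bilinear forms are given by

$$E_{\mathcal{T}}(\mathbf{u}, \mathbf{v}) := \sum_{K \in \mathcal{T}} E_K(\mathbf{u}, \mathbf{v}) = \sum_{K \in \mathcal{T}} (C D(\mathbf{u}), D(\mathbf{v}))_K, \qquad \Phi_{\mathcal{T}}(\mathbf{u}, \boldsymbol{\lambda}) := \sum_{e \in \mathcal{E}_{int}} ([\mathbf{u}], \boldsymbol{\lambda})_e.$$

The discretization of (9.14) is analogous to (6.22): find $\mathbf{u}_{FE} \in S^{k,0}(\Omega, \mathcal{T})^d$
and $\boldsymbol{\mu}_{FE} \in M^{k,0}(\Omega, \mathcal{T})^d$ such that $\forall \mathbf{v} \in S^{k,0}(\Omega, \mathcal{T})^d$, $\forall \boldsymbol{\lambda} \in M^{k,0}(\Omega, \mathcal{T})^d$

$$\begin{aligned} E_{\mathcal{T}}(\mathbf{u}_{FE}, \mathbf{v}) + \Phi_{\mathcal{T}}(\mathbf{v}, \boldsymbol{\mu}_{FE}) &= (\mathbf{S}, \mathbf{v})_\Omega + (\mathbf{g}, \mathbf{v})_{\Gamma_N}, \\ \Phi_{\mathcal{T}}(\mathbf{u}_{FE}, \boldsymbol{\lambda}) &= 0. \end{aligned} \qquad (9.15)$$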


The structure of the linear system corresponding to (9.15) is analogous to (6.9).

Implementation without fluxes. Assume that the polyhedron $\Omega \subset \mathbb{R}^d$ is
partitioned into a regular triangulation $\mathcal{T}$ of simplicial elements $K$. We consider
(9.1), (9.7) and (9.8) and assume the polynomial degrees are uniform
and equal to $k \ge 2$. Let $\mathcal{E}_{int}$ denote the set of all $(d-1)$-dimensional simplices
$e_{KK'}$ which are interfaces of $K, K' \in \mathcal{T}$. Then we have, analogous to (6.25), the constrained space

$$(\widetilde{S}^{k,0})^d = \Bigl\{ \mathbf{u} \in S^{k,0}(\Omega, \mathcal{T})^d : \int_{e_{KK'}} [\mathbf{u}] \cdot \boldsymbol{\varphi}\, ds = 0 \quad \forall \boldsymbol{\varphi} \in \mathcal{P}_{k-1}(e_{KK'})^d \Bigr\}. \qquad (9.16)$$

The matrix $A$ is the stiffness matrix of the form $E_{\mathcal{T}}(\mathbf{u}, \mathbf{v})$ on $(\widetilde{S}^{k,0})^d \times (\widetilde{S}^{k,0})^d$.
Moreover, $A$ is symmetric and positive definite if $k \ge 2$ and if (9.5) holds.

To see this, assume that $0 = E_{\mathcal{T}}(\mathbf{u}, \mathbf{u})$ for some $0 \ne \mathbf{u} \in (\widetilde{S}^{k,0})^d$. Hence we
get

$$0 = E_{\mathcal{T}}(\mathbf{u}, \mathbf{u}) \ge \min\{2\mu,\ 2\mu + \gamma d\} \sum_{K \in \mathcal{T}} \|D(\mathbf{u})\|_{L^2(K)}^2$$

which implies that $\mathbf{u}|_K$ is a rigid body motion, i.e.

$$\forall K \in \mathcal{T} \quad \exists A_K = -A_K^\top,\ \mathbf{b}_K : \quad \mathbf{u}|_K = A_K \mathbf{x} + \mathbf{b}_K.$$

If $K \cap \Gamma_D \ne \emptyset$, then $\mathbf{u}|_K = 0$ in these $K$.

Further, $\mathbf{u} = 0$ in the remaining elements $K \in \mathcal{T}$, since

$$\forall e_{KK'} \in \mathcal{E}_{int} : \quad \int_{e_{KK'}} [\mathbf{u}] \cdot \boldsymbol{\varphi}\, ds = 0$$

for every $\boldsymbol{\varphi} \in \mathcal{P}_1(e_{KK'})^d$, due to $k \ge 2$, and since $[\mathbf{u}]$ is linear on $e_{KK'}$.
Therefore $E_{\mathcal{T}}(\mathbf{u}, \mathbf{u}) = 0 \Rightarrow \mathbf{u} = 0$ and $A$ is positive definite, hence this
discretization of viscous stresses is dissipative. Notice that in the scalar case
in Section 6.3 the above argument works even for $k \ge 1$, since the “rigid body
motions” are piecewise constant then.

As in Section 6.3, the hp-MEM can be implemented without the fluxes. To
this end, we evaluate the broken bilinear form $E_{\mathcal{T}}(\mathbf{u}, \mathbf{v})$ on the constrained
space $(\widetilde{S}^{k,0})^d$, resulting in a symmetric, positive definite matrix $A$, i.e. giving
rise to a dissipative term, provided (9.5) holds.
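The key step of the argument is that the symmetric gradient vanishes exactly on rigid body motions $\mathbf{u} = A\mathbf{x} + \mathbf{b}$ with $A = -A^\top$. Since the gradient of such a field is the constant matrix $A$, this can be illustrated with a few lines; the helper names below are hypothetical.

```python
def symmetric_part(A):
    """(A + A^T) / 2: the symmetric gradient D(u) of the linear field u(x) = A x + b."""
    n = len(A)
    return [[0.5 * (A[i][j] + A[j][i]) for j in range(n)] for i in range(n)]


def is_rigid_body_gradient(A, tol=1e-14):
    """True if D(u) = 0 for u(x) = A x + b, i.e. if A is skew-symmetric."""
    return all(abs(entry) <= tol for row in symmetric_part(A) for entry in row)
```

A rotation generator such as `[[0, -2], [2, 0]]` passes the test, whereas a stretching such as `[[1, 0], [0, 0]]` does not. In contrast to the scalar case, the kernel therefore contains linear functions, which is why test functions of degree at least one, hence $k \ge 2$, are needed on the interfaces.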


9.6  Discontinuous  Galerkin  discretization 

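From (9.15) it is now straightforward to derive the DG-discretization of (9.1);
as for the diffusion problems in Section 6.3, we replace in $\Phi_{\mathcal{T}}(\cdot, \boldsymbol{\mu})$ in (9.14), on each
edge $e \in \mathcal{E}_{int}$, the multiplier $\boldsymbol{\mu}$ in the saddle point form by the flux
average, i.e.

$$\boldsymbol{\mu}_e = \langle \tau(\mathbf{u})\,\mathbf{n} \rangle, \qquad \boldsymbol{\lambda}_e = -\langle \tau(\mathbf{v})\,\mathbf{n} \rangle \qquad (9.17)$$

where $\mathbf{n}$ denotes the unit normal vector perpendicular to the interface $e$,
resulting in: find $\mathbf{u}_{FE} \in S^{\mathbf{p},0}(\Omega, \mathcal{T}, F_{\mathcal{P}})^d$ such that

$$E_{DG}(\mathbf{u}_{FE}, \mathbf{v}) = (\mathbf{S}, \mathbf{v})_\Omega + (\mathbf{g}, \mathbf{v})_{\Gamma_N} \qquad \forall \mathbf{v} \in S^{\mathbf{p},0}(\Omega, \mathcal{T}, F_{\mathcal{P}})^d, \qquad (9.18)$$

where we defined

$$E_{DG}(\mathbf{u}, \mathbf{v}) = \sum_{K \in \mathcal{T}} E_K(\mathbf{u}, \mathbf{v}) + \sum_{e \in \mathcal{E}_{int}} \int_e \bigl\{ \langle \tau(\mathbf{v})\mathbf{n} \rangle \cdot [\mathbf{u}] - [\mathbf{v}] \cdot \langle \tau(\mathbf{u})\mathbf{n} \rangle \bigr\}\, ds. \qquad (9.19)$$

And we have once more the positive semidefiniteness

$$E_{DG}(\mathbf{u}, \mathbf{u}) \ge 0$$

and $E_{DG}(\mathbf{u}, \mathbf{u}) = 0$ if and only if $\mathbf{u}|_K$ is a rigid body motion. It is at present
open whether (9.19) satisfies a discrete inf-sup stability condition.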


10  Incompressibility 

10.1  Basic  Equations 

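For an incompressible medium, $\gamma \to \infty$ in (9.3), thereby imposing in (1.2) the
incompressibility constraint

$$\operatorname{div} \mathbf{u} = 0 \quad \text{in } \Omega. \qquad (10.1)$$

This constraint changes the momentum equation (1.2) to

$$\frac{\partial}{\partial t}(\rho u_i) + \nabla \cdot (\rho\, \mathbf{u}\, u_i) + \frac{\partial p}{\partial x_i} = \mu \Delta u_i + S_i, \qquad i = 1, \dots, d \qquad (10.2)$$

if $\mu > 0$ is constant. The system (1.1), (10.1), (10.2) constitutes the inhomogeneous,
incompressible NSE.

If in addition $\rho = \rho_0 = \text{const}$ in $\Omega$, it follows that

$$\rho_0 \frac{\partial u_i}{\partial t} + \rho_0 \nabla \cdot (\mathbf{u}\, u_i) + \frac{\partial p}{\partial x_i} = \mu \Delta u_i + S_i, \qquad i = 1, \dots, d \qquad (10.3)$$

or, upon the rescaling

$$\nu \leftarrow \mu/\rho_0, \qquad p \leftarrow p/\rho_0, \qquad S_i \leftarrow S_i/\rho_0,$$

$$\frac{\partial u_i}{\partial t} + \nabla \cdot (\mathbf{u}\, u_i) + \frac{\partial p}{\partial x_i} = \nu \Delta u_i + S_i, \qquad i = 1, \dots, d. \qquad (10.4)$$

We remark that the energy equation is now absent and that the function $p$ is
a Lagrange multiplier for the constraint (10.1). We shall not dwell upon the
derivation of (10.2). We note, however, that (10.1) generally causes difficulties
for a FE discretization which will also appear at large, but finite values of
$\gamma$ in (9.3). Stable FE discretizations for (10.1), (10.2) promise also robust
performance for (1.2) and (9.3) as $\gamma \to \infty$.

Once again, we focus on the space discretization of (10.1)-(10.4). To this
end, we consider the steady case ($\partial/\partial t = 0$). Linearizing around $\mathbf{u} = \mathbf{w}$ with
$\operatorname{div} \mathbf{w} = 0$ yields in (10.4) the Oseen equations

$$\begin{aligned} -\nu \Delta \mathbf{u} + \mathbf{w} \cdot \nabla \mathbf{u} + \nabla p &= \mathbf{S} && \text{in } \Omega, \\ \nabla \cdot \mathbf{u} &= 0 && \text{in } \Omega. \end{aligned} \qquad (10.5)$$

If, in addition, $\mathbf{w} = 0$, we get the Stokes equations

$$\begin{aligned} -\nu \Delta \mathbf{u} + \nabla p &= \mathbf{S} && \text{in } \Omega, \\ \nabla \cdot \mathbf{u} &= 0 && \text{in } \Omega. \end{aligned} \qquad (10.6)$$

Both (10.5) and (10.6) are completed by no-slip boundary conditions

$$\mathbf{u} = 0 \quad \text{on } \partial\Omega. \qquad (10.7)$$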

10.2  Variational  formulation  of  the  Stokes  problem 

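Consider first (10.6) and assume $\mathbf{S} \in L^2(\Omega)^d$. The discretization of the
incompressibility constraint (10.1) can be done in two ways:

a) incorporation into the space, i.e. we look for $\mathbf{u} \in J_0 := \{\mathbf{u} \in H_0^1(\Omega)^d :
\nabla \cdot \mathbf{u} = 0 \text{ in } L^2(\Omega)\}$. It is generally difficult to construct FE subspaces
of $J_0$,

b) enforcement of (10.1) via a Lagrange multiplier $p$:
find $\mathbf{u} \in H_0^1(\Omega)^d$, $p \in L_0^2(\Omega)$ such that

$$\begin{aligned} \nu(\nabla \mathbf{u}, \nabla \mathbf{v})_\Omega - (p, \nabla \cdot \mathbf{v})_\Omega &= (\mathbf{S}, \mathbf{v})_\Omega && \forall \mathbf{v} \in H_0^1(\Omega)^d, \\ (\nabla \cdot \mathbf{u}, q)_\Omega &= 0 && \forall q \in L_0^2(\Omega). \end{aligned} \qquad (10.8)$$

Here $L_0^2(\Omega) = \{q \in L^2(\Omega) : (q, 1)_\Omega = 0\}$.

This is now a mixed problem and the FE discretization of (10.7) can be
based on the standard spaces $S^{k,\ell}$ of Section 4.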


10.3  FE-discretization  of  the  Stokes  problem 

Stability. Let $V_N \subset H_0^1(\Omega)^d$, $M_N \subset L_0^2(\Omega)$ be any pair of finite-dimensional
spaces. The Galerkin discretization of (10.6), (10.7) reads:
find $\mathbf{u}_N \in V_N$, $p_N \in M_N$ such that

$$\begin{aligned} \nu(\nabla \mathbf{u}_N, \nabla \mathbf{v})_\Omega - (p_N, \nabla \cdot \mathbf{v})_\Omega &= (\mathbf{S}, \mathbf{v})_\Omega && \forall \mathbf{v} \in V_N, \\ (\nabla \cdot \mathbf{u}_N, q)_\Omega &= 0 && \forall q \in M_N. \end{aligned} \qquad (10.9)$$

In principle, we may choose for $V_N$, $M_N$ the hp-FE subspaces of Section 4.
However, the pair $V_N$, $M_N$ must satisfy the discrete inf-sup stability condition

$$\inf_{0 \ne q \in M_N}\ \sup_{0 \ne \mathbf{u} \in V_N}\ \frac{(q, \nabla \cdot \mathbf{u})_\Omega}{\|q\|_0\, \|\nabla \mathbf{u}\|_0} \ge \gamma_N > 0 \qquad (10.10)$$

where $\gamma_N$ is the inf-sup (or stability) constant. (10.10) ensures stability of the
approximation and precludes in particular spurious pressure modes. Natural
choices for $V_N$, $M_N$, such as

$$V_N = S_0^{k,1}(\Omega, \mathcal{T})^d, \qquad M_N = S_0^{k-1,0}(\Omega, \mathcal{T}),$$

$k \ge 1$, generally fail (10.10): one must choose $(V_N, M_N)$ carefully.


Divergence stable elements on shape regular meshes. Let us present
various choices of stable hp-spaces. To this end assume that all element mappings
$F_K$ are affine and that $\mathcal{T}$ is shape-regular. Then the spaces $S^{k,\ell}(\Omega, \mathcal{T})$
are determined by the polynomial spaces $V_K$, $M_K$ on the reference element
$\hat{K}$:

$$S_0^{k,0}(\Omega, \mathcal{T}) = \{q \in L_0^2(\Omega) : q \circ F_K \in M_K\}, \qquad (10.11)$$

$$S_0^{k,1}(\Omega, \mathcal{T})^d = \{\mathbf{u} \in H_0^1(\Omega)^d : \mathbf{u} \circ F_K \in V_K\}. \qquad (10.12)$$

In the following table we list some pairs $V_K$, $M_K$ and the mathematically
established bounds on the inf-sup constant $\gamma_N$ in (10.10). We assume shape
regular, possibly non-quasiuniform meshes.

    $V_K$             $M_K$                 $\gamma_N$            $\hat{K}$
    $\mathcal{Q}_k$   $\mathcal{Q}_{k-2}$   $O(k^{-(d-1)/2})$     $\hat{Q}$    (10.13a)
    $\mathcal{Q}_k$   $\mathcal{P}_{k-1}$   $O(1)$                $\hat{Q}$    (10.13b)
    $\mathcal{P}_k$   $\mathcal{P}_{k-2}$   $O(k^{-3})$           $\hat{T}$    (10.13c)

(10.13a) and (10.13b) are sharp and hold in two and three dimensions. We
remark that the bound (10.13c), proved in [58], [60], is suboptimal and only
valid in two dimensions. If used on a shape regular, possibly geometric mesh,
the velocity-pressure combinations (10.13) give in (10.10) inf-sup constants
$\gamma_N$ which are independent of the element sizes $h_K$, $K \in \mathcal{T}$.

Stable elements on $(\kappa, \sigma)$-boundary layer meshes. The situation is
different on geometric, affine $(\kappa, \sigma)$-boundary layer meshes (cf. Section 4.2 and
Figures 8 and 9) containing long rectangles. Here the combination $\mathcal{Q}_k \times \mathcal{Q}_{k-2}$
is stable independent of the aspect ratio [62], [55].

Theorem 10.1. Let $\Omega \subset \mathbb{R}^2$ be a polygon and $\mathcal{T}$ be an affine $(\kappa, \sigma)$ geometric
boundary layer mesh. Let in (10.10) for $k = 2, 3, \dots$

$$V_N = S_0^{k,1}(\Omega, \mathcal{T})^d, \qquad M_N = S_0^{k-2,0}(\Omega, \mathcal{T})$$

with element spaces (10.13). Then (10.10) holds with $\gamma_N \ge C k^{-3}$ if $\mathcal{T}$ contains
triangles and $\gamma_N \ge C k^{-1/2}$ otherwise. Here $C > 0$ is independent of $k$
and of the aspect ratio of the rectangles (it depends only on $\kappa$ and $\sigma$).

No divergence stable, high order and high aspect ratio triangular element
family is known to date.

10.4  GLS  stabilized  hp-FEM  for  the  Stokes  problem 

The divergence stability (10.10) imposed the use of different polynomial orders
for the velocity and pressure approximations. Equal order spaces are not divergence
stable. There is, however, a GLS (Galerkin Least Squares) approach
due to Hughes and Franca which allows a) to circumvent (10.10) and b) to use
equal order approximations for $V_N$ and $M_N$. We show here an hp-extension
of this approach. Select

$$V_N = S_0^{k,1}(\Omega, \mathcal{T})^d, \qquad M_N = S_0^{k,1}(\Omega, \mathcal{T}) \qquad (10.14)$$

with equal elemental polynomial degrees. Then the hp-GLS FEM for (10.7)
reads: find $(\mathbf{u}_{GLS}, p_{GLS}) \in V_N \times M_N$ such that

$$B_\alpha(\mathbf{u}_{GLS}, p_{GLS}; \mathbf{v}, q) = F_\alpha(\mathbf{v}, q) \qquad \forall (\mathbf{v}, q) \in V_N \times M_N, \qquad (10.15)$$

where $\alpha > 0$ is a parameter independent of $k$ and $h_K$, and

$$B_\alpha(\mathbf{u}, p; \mathbf{v}, q) := \nu(\nabla \mathbf{u}, \nabla \mathbf{v}) - (p, \nabla \cdot \mathbf{v}) - (\nabla \cdot \mathbf{u}, q) - \alpha \sum_{K \in \mathcal{T}} h_K^2\, (-\nu \Delta \mathbf{u} + \nabla p,\ -\nu \Delta \mathbf{v} + \nabla q)_K,$$

$$F_\alpha(\mathbf{v}, q) := (\mathbf{S}, \mathbf{v}) - \alpha \sum_{K \in \mathcal{T}} h_K^2\, (\mathbf{S},\ -\nu \Delta \mathbf{v} + \nabla q)_K.$$

Notice that $\alpha = 0$ gives the (unstable) Galerkin formulation (10.9) (just add
the equations there). Note also that the GLS formulation (10.15) is fully
consistent: inserting the exact solution $(\mathbf{u}, p)$, we see that the GLS terms
disappear, for any value of $\alpha$.
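The consistency claim can be verified in one line. The following is a sketch under two assumptions made only for this illustration: the exact solution is smooth enough that the strong form $-\nu \Delta \mathbf{u} + \nabla p = \mathbf{S}$ holds in $L^2(K)$ on every element, and the stabilization terms carry the elementwise weight $h_K^2$.

```latex
\begin{aligned}
B_\alpha(\mathbf{u}, p; \mathbf{v}, q) - F_\alpha(\mathbf{v}, q)
 &= \underbrace{\nu(\nabla \mathbf{u}, \nabla \mathbf{v}) - (p, \nabla\cdot\mathbf{v}) - (\mathbf{S}, \mathbf{v})}_{=\,0 \text{ (weak momentum equation)}}
   \;-\; \underbrace{(\nabla\cdot\mathbf{u}, q)}_{=\,0 \text{ (incompressibility)}} \\
 &\quad - \alpha \sum_{K \in \mathcal{T}} h_K^2\,
   \bigl(\underbrace{-\nu\Delta\mathbf{u} + \nabla p - \mathbf{S}}_{=\,0 \text{ on each } K},\;
   -\nu\Delta\mathbf{v} + \nabla q\bigr)_K \;=\; 0 .
\end{aligned}
```

Since the element residual vanishes identically, the identity holds for every $\alpha$ and every discrete test pair $(\mathbf{v}, q)$, so the stabilization does not perturb the exact solution.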

We have

Theorem 10.2. [54] Let $\Omega \subset \mathbb{R}^2$ be a polygon and $\mathcal{T}$ be a shape-regular
mesh. Then there exists $C > 0$ independent of $\alpha$, $k$ and $h_K$ such that $B_\alpha$ in
(10.15) is stable, more precisely that

$$\sup_{\substack{0 \ne \mathbf{v} \in V_N \\ 0 \ne q \in M_N}} \frac{B_\alpha(\mathbf{u}, p; \mathbf{v}, q)}{\bigl(\|\mathbf{u}\|_{1,\Omega}^2 + \|p\|_{0,\Omega}^2\bigr)^{1/2} \bigl(\|\mathbf{v}\|_{1,\Omega}^2 + \|q\|_{0,\Omega}^2\bigr)^{1/2}} \ge C\, k_{\max}^{-4}$$

for all $0 \ne \mathbf{u} \in V_N$, $0 \ne p \in M_N$.

Remark 10.3. The above result does not allow for anisotropic rectangles. It
does allow, however, curved elements, i.e., nonaffine patch maps $F_{\mathcal{P}}$. GLS
stabilization on curved, anisotropic meshes is open at present. For more
information on GLS methods, we refer to [38] and the references there.

Remark 10.4. The original GLS methods were developed for $k = 1$, so that
the domain integrals in $B_\alpha$ would simplify. The evaluation of second order
derivatives of high order polynomials in the stabilization terms in $B_\alpha$ is costly in
the element stiffness matrix evaluation of (10.15).

10.5  Numerical  experiments 

Implementational details. Here we present some numerical results, taken
from [29], a) to show that hp-FEM give exponential convergence even if
the solution has singularities and b) to compare the pure Galerkin approach
with divergence stable elements with the GLS approach and equal order
hp-interpolation. Accordingly, we compare

The Galerkin formulation (GFEM): Let

$$V_N = S_0^{k,1}(\mathcal{T})^2, \qquad M_N = S_0^{k-2,0}(\mathcal{T}).$$

The GFEM is to find $(\mathbf{u}_N, p_N) \in V_N \times M_N$ such that

$$B_0(\mathbf{u}_N, p_N; \mathbf{v}, q) = F_0(\mathbf{v}, q) \quad \text{for all } (\mathbf{v}, q) \in V_N \times M_N.$$

The Galerkin Least Squares formulation (GLSFEM):

Let $\alpha > 0$ and

$$V_N = S_0^{k,1}(\mathcal{T})^2, \qquad M_N = S_0^{k,1}(\mathcal{T}).$$

The GLSFEM is to find $(\mathbf{u}_N, p_N) \in V_N \times M_N$ such that

$$B_\alpha(\mathbf{u}_N, p_N; \mathbf{v}, q) = F_\alpha(\mathbf{v}, q) \quad \text{for all } (\mathbf{v}, q) \in V_N \times M_N.$$

Note that we consider a continuous pressure approximation in the GLSFEM
while the pressure is discontinuously interpolated in the GFEM. This
choice has been made since it points out the principal advantage of implementing
the GLSFEM: in the GLSFEM, velocity and pressure degrees of freedom
are treated in exactly the same way. For the GFEM, implementational difficulties
arise if one enforces different polynomial degrees for the velocity and
the pressure and different interelement continuity requirements for $\mathbf{u}_N$ and
$p_N$.

Our hp-FE implementation for the Stokes problem is based on HP90, a
flexible FE code for general elliptic problems in Fortran 90 [21]. HP90 allows
for isotropic and anisotropic mesh refinements, both h- and p-refinements. In
particular, h-refinements can lead to irregular meshes with hanging nodes.
HP90 is designed to handle such meshes and enforces the appropriate continuity
requirements by constraining these irregular nodes. We refer to [21],
[50] for a detailed description of the constraining procedure.

In our numerical examples we use quadrilateral finite elements to discretize
the domain $\Omega$. Implementationally, the elemental polynomial degrees
$k_K$ are further split into edge and internal degrees that can vary within the
element, i.e. $k_K$ is to be understood as the vector $k_K = \{k_K^1, k_K^2, k_K^3, k_K^4, k_K^5\}$.
Here $k_K^i$, $i = 1, \dots, 4$, is the polynomial degree on the $i$-th edge, and $k_K^5$ the
polynomial degree in the interior of the element. The nodes $a_K^1, \dots, a_K^9$ correspond
to $k_K$, where $a_K^1, \dots, a_K^4$ denote the vertex nodes, $a_K^5, \dots, a_K^8$ the
mid-side nodes and $a_K^9$ is the middle node. This is shown schematically in
Figure 10.1 for the reference square $\hat{Q} = (0, 1)^2$. The shape functions that are
associated with the nodes $a^i$ of the reference element are the nodal based
Lagrange shape functions, but other shape functions can be used as well (cf.
[21]).


Fig. 10.1. Quadrilateral reference element $\hat{Q}$ with nodes $(a^1, \dots, a^9)$.

In the case of the GLS method we need to interpret the reference element
$\hat{Q}$ as a vector valued reference element, i.e. we use the shape functions and
degrees of freedom (dof) that correspond to $\hat{Q}$ to approximate each velocity
component and the pressure. The least-squares stabilization term in $B_\alpha$ in
(10.15) involves second derivatives, and therefore we also need the second
derivatives of the reference element shape functions $\varphi(\xi_1, \xi_2)$ with respect to
the physical coordinates $(x_1, x_2) = F_K(\xi_1, \xi_2)$. In the case of an affine element
mapping $F_K$, the chain rule gives

$$\frac{\partial^2 \varphi}{\partial x_i^2} = \sum_{j,l=1}^{2} \frac{\partial^2 \varphi}{\partial \xi_j\, \partial \xi_l}\, \frac{\partial \xi_j}{\partial x_i}\, \frac{\partial \xi_l}{\partial x_i}. \qquad (10.16)$$

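But the terms $\partial \xi_j / \partial x_i$ are not constant in the case of a general (e.g. bilinear)
element mapping, which leads to

$$\frac{\partial^2 \varphi}{\partial x_i^2} = \sum_{j,l=1}^{2} \frac{\partial^2 \varphi}{\partial \xi_j\, \partial \xi_l}\, \frac{\partial \xi_j}{\partial x_i}\, \frac{\partial \xi_l}{\partial x_i} + \sum_{j=1}^{2} \frac{\partial \varphi}{\partial \xi_j}\, \frac{\partial^2 \xi_j}{\partial x_i^2}. \qquad (10.17)$$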


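The terms $\partial \xi_j / \partial x_i$ and $\partial^2 \xi_j / \partial x_i^2$ are rational functions and thus cannot be
integrated exactly, but the use of a higher order integration rule reduces the
error in the element computations. Nevertheless, the element computations
are completely standard and for an element $K$ the local element stiffness
matrix $E_{K,\alpha}$ and load vector $F_{K,\alpha}$ result in an element system of equations
that is of the well known form

$$E_{K,\alpha} \begin{pmatrix} u_1 \\ u_2 \\ p \end{pmatrix} = \begin{pmatrix} A_\alpha & 0 & B_{1,\alpha}^\top \\ 0 & A_\alpha & B_{2,\alpha}^\top \\ B_{1,\alpha} & B_{2,\alpha} & \alpha M \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ p \end{pmatrix} = \begin{pmatrix} F_{1,\alpha} \\ F_{2,\alpha} \\ 0 \end{pmatrix} \qquad (10.18)$$

where $\mathbf{u} = (u_1, u_2)$; $A_\alpha$, $B_{1,\alpha}$, $B_{2,\alpha}$ as well as $M$ correspond to the usual
velocity and pressure combinations in (10.13), and $X^\top$ is the transpose of $X$.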

In the context of geometric refinements with irregular nodes we have to
modify $E_{K,\alpha}$ in order to account for these irregular nodes. HP90 is designed
to enforce the appropriate constraints automatically on the local element
stiffness matrix and load vector. This procedure [21] results in a modified
local stiffness matrix $\tilde{E}_{K,\alpha}$ that corresponds to the actual globally existing
dof. This modified matrix $\tilde{E}_{K,\alpha}$ can then be assembled to obtain the global
stiffness matrix.

In the case of the G method the situation is somewhat more complicated,
because the velocities and the pressure have different approximation orders
and the pressure is, in addition, discontinuous. Here we use the shape
functions of order $k$ on $\hat{Q}$ to approximate the velocity components and
the shape functions of order $k-2$ to approximate the pressure. The dof for
the velocity components are interpreted in the standard way, but the pressure
dof are now all interpreted as dof that belong to bubble shape functions,
although the shape functions of order $k-2$ contain vertex and side shape
functions. The number of bubble shape functions of order $k$ on $\hat{Q}$ is
exactly the same as the total number of shape functions of order $k-2$. This
motivates interpreting $\hat{Q}$ as a vector-valued reference element with two
components for the vertex and side dof and three components for the bubble
dof, the shape functions being chosen as described above. The element
computations for the G method are then again standard, but we have to take
the unusual element definition into account. The local element stiffness
matrix $E_K$ is of a form similar to (10.18) with $\alpha = 0$. We further
emphasize that we do not need the second derivatives of the shape functions
to compute $E_K$. For an irregular mesh, we again have to modify $E_K$ to a
local matrix $\tilde{E}_K$ that corresponds to global dof. But now we apply
the constraints only to the velocity components because the pressure may be
discontinuous across element boundaries. The element matrices $\tilde{E}_K$
are then assembled in principle in the usual way, but the nonstandard element
definition requires a generalization of the assembling procedure to account
for the presence of continuous and discontinuous field variables.
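
The dof count behind this element definition can be checked with a few lines of Python. This is only a sketch under the assumption that "shape functions of order k" means the full tensor-product space on the reference square (4 vertex, 4(k-1) side and (k-1)^2 bubble functions); the function names are hypothetical and not part of any hp-FEM code mentioned in the text.

```python
def shape_function_counts(k):
    """(vertex, side, bubble) counts for tensor-product order k on the reference square.

    Assumption: the local space is the full tensor-product space Q_k.
    """
    vertex = 4
    side = 4 * (k - 1)        # k - 1 side functions on each of the 4 edges
    bubble = (k - 1) ** 2     # interior (bubble) functions
    return vertex, side, bubble


def total_shape_functions(k):
    """Dimension of the tensor-product space of order k."""
    return (k + 1) ** 2


# The pressure space of order k - 2 has as many functions as there are
# velocity bubbles of order k, so pressure dof can be stored as bubble dof.
for k in range(2, 9):
    assert shape_function_counts(k)[2] == total_shape_functions(k - 2)
```

With this count, each bubble slot of the vector-valued element carries three components (two velocities and one pressure), while vertex and side slots carry two.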

In both the G and GLS methods we have to enforce Dirichlet boundary
conditions that correspond to the boundary values of the exact solutions. The
standard procedure very often used in practice is to interpolate the boundary
data at equidistant points, but this procedure is known to be numerically
unstable for higher approximation orders. In connection with higher order
methods, interpolation at the Gauss-Lobatto points is better suited (cf. [21]).
We enforce the Dirichlet data for the G and GLS methods in exactly the same
way at the element level.
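
One common way to quantify this instability is the Lebesgue constant of the interpolation nodes. The sketch below compares equidistant and Gauss-Lobatto nodes on the reference interval [-1, 1]. It is an illustration only, not the procedure of [21]: the Gauss-Lobatto nodes are taken as the endpoints plus the zeros of the derivative of the Legendre polynomial, located here by a simple sign scan and bisection, and all function names are hypothetical.

```python
def legendre_derivative(n, x):
    """P_n'(x) for |x| < 1, using the three-term recurrence for P_n."""
    if n == 0:
        return 0.0
    p_prev, p = 1.0, x
    for j in range(1, n):
        p_prev, p = p, ((2 * j + 1) * x * p - j * p_prev) / (j + 1)
    return n * (x * p - p_prev) / (x * x - 1.0)


def gauss_lobatto_nodes(n, cells=4000):
    """The n + 1 Gauss-Lobatto nodes: -1, the zeros of P_n', and +1."""
    grid = [-1.0 + (2 * i + 1) / cells for i in range(cells)]
    nodes = [-1.0]
    for a, b in zip(grid[:-1], grid[1:]):
        if legendre_derivative(n, a) * legendre_derivative(n, b) < 0.0:
            for _ in range(60):                      # bisection on [a, b]
                m = 0.5 * (a + b)
                if legendre_derivative(n, a) * legendre_derivative(n, m) <= 0.0:
                    b = m
                else:
                    a = m
            nodes.append(0.5 * (a + b))
    nodes.append(1.0)
    return nodes


def lebesgue_constant(nodes, samples=2001):
    """Maximum over a sample grid of the sum of |Lagrange basis functions|."""
    best = 0.0
    for s in range(samples):
        x = -1.0 + 2.0 * s / (samples - 1)
        total = 0.0
        for i, xi in enumerate(nodes):
            li = 1.0
            for j, xj in enumerate(nodes):
                if j != i:
                    li *= (x - xj) / (xi - xj)
            total += abs(li)
        best = max(best, total)
    return best


n = 16
equidistant = [-1.0 + 2.0 * i / n for i in range(n + 1)]
print(lebesgue_constant(equidistant), lebesgue_constant(gauss_lobatto_nodes(n)))
```

For moderate orders the equidistant value grows rapidly with n, whereas the Gauss-Lobatto value grows only slowly, which is consistent with the recommendation above.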

Although we apply Dirichlet boundary conditions to the velocity components,
the global stiffness matrix is not invertible in either formulation, because
the constant pressure mode is still not eliminated. To obtain invertibility
of the global system we fix the pressure at one dof. Then the global system
can be solved and we only have to postprocess the pressure so that the mean
value is zero, i.e. so that the pressure is an element of $L^2_0(\Omega)$.

Numerical  results  for  G  &  GLS  hp  FEM.  In  the  following  we  first 
describe  the  two  model  problems  that  we  use.  Both  model  problems  have 
exact  solutions  and  therefore  allow  for  a  numerical  convergence  study.  These 
two  exact  solutions  have  significantly  different  characteristics,  i.e.  one  solution 
is  smooth  and  the  other  one  has  a  corner  singularity  at  the  reentrant  corner. 
These  two  model  problems  are  well  suited  for  a  comparison  of  the  G-  and 
GLS-  hp-FEM. 

In our numerical results we always present the relative errors that we
obtained with our hp-FE implementation. We show only the errors for the
first velocity component (the results for the second one being completely
similar) and the pressure. The velocity error is computed in the $H^1$-norm
and the pressure error in the $L^2$-norm. In order to be consistent with the
pressure being in $L^2_0(\Omega)$, we subtract the mean value from the exact
pressure $p$ and the numerical pressure $p_N$, i.e. we subtract terms of the
form

$$ \frac{1}{|\Omega|} \int_\Omega p \, dx , $$

and the relative error in the pressure is computed as

$$ \frac{\|\bar{p} - \bar{p}_N\|_{L^2(\Omega)}}{\|\bar{p}\|_{L^2(\Omega)}} , $$

where $\bar{p}$ and $\bar{p}_N$ denote the pressures after subtraction of
their mean values.

The relative $H^1$-error in the velocity components is computed in the
standard way. We remark finally that the Gauss integration rule that we use
to compute the errors is of significantly higher order than the integration
rule in the element computations.
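
The mean-value correction and the relative pressure error can be sketched at the level of quadrature data as follows. This is a minimal illustration, assuming that the pressures are available as values at quadrature points with positive weights, and that the relative error is the $L^2$-norm of the difference of the mean-free pressures divided by the $L^2$-norm of the mean-free exact pressure; the function names are hypothetical.

```python
import math


def subtract_mean(values, weights):
    """Remove the quadrature mean (sum w_i p_i) / (sum w_i) from sampled values."""
    area = sum(weights)
    mean = sum(w * p for w, p in zip(weights, values)) / area
    return [p - mean for p in values]


def relative_l2_error(exact, approx, weights):
    """Relative L2 error between two pressures after removing their mean values."""
    e0 = subtract_mean(exact, weights)
    a0 = subtract_mean(approx, weights)
    num = sum(w * (e - a) ** 2 for w, e, a in zip(weights, e0, a0))
    den = sum(w * e ** 2 for w, e in zip(weights, e0))
    return math.sqrt(num / den)


# A constant shift of the numerical pressure does not change the error:
weights = [0.25, 0.25, 0.25, 0.25]
exact = [1.0, 2.0, 3.0, 4.0]
shifted = [p + 7.0 for p in exact]
print(relative_l2_error(exact, shifted, weights))
```

The example shows why fixing the pressure at one dof is harmless: the resulting constant offset is removed before the error is measured.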

Model problems. In our model problems we consider the Stokes equation
(10.6), (10.7) with viscosity $\nu = 1$ in the L-shaped domain $\Omega$ shown
in Figure 10.2. Such domains appear also in the backward facing step flow
problem or in the so-called 4:1 contraction problem. On $\Omega$ we use
geometric meshes $\mathcal{T}^{n+1,\sigma}$ with $n+1$ layers. Such a mesh
(with irregular nodes) is shown for a grading factor $\sigma = 0.5$ in
Figure 10.2.
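
The layer structure of such a mesh can be illustrated in one dimension, along a ray emanating from the reentrant corner. The sketch below is an assumption-laden simplification: it only generates the radial node positions of a geometric mesh with n + 1 layers and grading factor sigma on the unit interval, not the two-dimensional mesh of Figure 10.2, and the function name is hypothetical.

```python
def geometric_nodes(n, sigma):
    """Nodes 0 < sigma^n < ... < sigma < 1 of a 1D geometric mesh with n + 1 layers."""
    if not 0.0 < sigma < 1.0:
        raise ValueError("grading factor must lie in (0, 1)")
    return [0.0] + [sigma ** (n - j) for j in range(n + 1)]


nodes = geometric_nodes(3, 0.5)
print(nodes)  # [0.0, 0.125, 0.25, 0.5, 1.0]
```

Each layer away from the corner is larger than its inner neighbour by the factor 1/sigma, so the elements shrink geometrically towards the singularity.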


Fig. 10.2. L-shaped domain $\Omega$ and a geometric mesh on $\Omega$.


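We use two exact solutions $(u_1, p_1)$ and $(u_2, p_2)$, the first one
exhibiting corner singularity phenomena at the reentrant corner $0$, the
second one being analytic in $\overline{\Omega}$ (including the corners). In
polar coordinates $(r, \varphi)$ at the origin the first exact solution is
given by

$$ u_1 = r^{\lambda}
   \begin{pmatrix}
     (1+\lambda)\sin(\varphi)\,\Psi(\varphi) + \cos(\varphi)\,\Psi'(\varphi) \\
     \sin(\varphi)\,\Psi'(\varphi) - (1+\lambda)\cos(\varphi)\,\Psi(\varphi)
   \end{pmatrix} , \tag{10.21} $$

$$ p_1 = -r^{\lambda-1}\left[(1+\lambda)^2\,\Psi'(\varphi) + \Psi'''(\varphi)\right]/(1-\lambda)
   \tag{10.22} $$

with

$$ \Psi(\varphi) = \sin((1+\lambda)\varphi)\cos(\lambda\omega)/(1+\lambda) - \cos((1+\lambda)\varphi)
   - \sin((1-\lambda)\varphi)\cos(\lambda\omega)/(1-\lambda) + \cos((1-\lambda)\varphi),
   \qquad \omega = \frac{3\pi}{2} . $$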




The exponent $\lambda$ is the smallest positive solution of

$$ \sin(\lambda\omega) + \lambda\sin(\omega) = 0, \tag{10.23} $$

which is $\lambda \approx 0.5444838205973307$. This solution satisfies the
homogeneous Stokes equation, i.e. $-\Delta u_1 + \nabla p_1 = 0$ in $\Omega$,
and we have $u_1 = 0$ on the segments $\Gamma_1$, $\Gamma_2$ shown in Figure
10.2. We emphasize that $(u_1, p_1)$ is analytic in
$\overline{\Omega} \setminus \{0\}$, but $\nabla u_1$ and $p_1$ are singular
at the origin. In particular, $u_1 \notin H^2(\Omega)^2$ and
$p_1 \notin H^1(\Omega)$. This first solution reflects perfectly the typical
(singular) behavior of solutions of the Stokes equations near reentrant
corners and is generic (compare with (3.1)).
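
The singular exponent can be recomputed with a few lines of Python. The sketch assumes that the opening angle is $\omega = 3\pi/2$ (the interior angle of the L-shaped domain at the reentrant corner) and that the characteristic equation is read as $\sin(\lambda\omega) + \lambda\sin(\omega) = 0$; both are assumptions about the garbled source. The root finder (sign scan followed by bisection) and its names are hypothetical.

```python
import math

OMEGA = 1.5 * math.pi  # assumed interior angle at the reentrant corner


def corner_equation(lam, omega=OMEGA):
    """Left-hand side of the assumed characteristic equation for the exponent."""
    return math.sin(lam * omega) + lam * math.sin(omega)


def smallest_positive_root(f, step=1e-3, upper=2.0, tol=1e-14):
    """Scan (0, upper] for the first sign change of f, then bisect."""
    a = step
    while a < upper:
        b = a + step
        if f(a) * f(b) <= 0.0:
            while b - a > tol:
                m = 0.5 * (a + b)
                if f(a) * f(m) <= 0.0:
                    b = m
                else:
                    a = m
            return 0.5 * (a + b)
        a = b
    raise ValueError("no sign change found")


lam = smallest_positive_root(corner_equation)
print(lam)
```

Under these assumptions the computed root should agree with the value quoted above in its leading digits. Since $\lambda < 1$, the factor $r^{\lambda-1}$ in the pressure and in the velocity gradient is unbounded at the origin.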

The second exact solution we use is somewhat artificial, since it is analytic
in $\overline{\Omega}$ (including the corners). In practice, one cannot
expect solutions to behave so nicely at reentrant corners. Nevertheless,
smooth solutions arise for example in smooth domains and it is hence
reasonable to validate the numerical performance for such exact solutions
too. We take

$$ u_2(x,y) = \begin{pmatrix} -\exp(x)\,[\,y\cos(y) + \sin(y)\,] \\ \exp(x)\,y\sin(y) \end{pmatrix},
   \tag{10.24} $$

$$ p_2 = 2\exp(x)\sin(y). \tag{10.25} $$

As above, $-\Delta u_2 + \nabla p_2 = 0$.
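
That this pair solves the homogeneous Stokes equations with viscosity 1 can be checked numerically. The sketch below assumes the reading $u_2 = (-e^x[y\cos y + \sin y],\ e^x y\sin y)$, $p_2 = 2e^x\sin y$ of (10.24), (10.25) and evaluates the residuals by central differences at one sample point; the step size and the function names are illustrative choices.

```python
import math


def u2(x, y):
    """Velocity of the smooth model solution (assumed reading of (10.24))."""
    return (-math.exp(x) * (y * math.cos(y) + math.sin(y)),
            math.exp(x) * y * math.sin(y))


def p2(x, y):
    """Pressure of the smooth model solution (assumed reading of (10.25))."""
    return 2.0 * math.exp(x) * math.sin(y)


def stokes_residual(x, y, h=1e-3):
    """Central-difference approximation of (-Laplace(u) + grad(p), div(u)) at (x, y)."""
    def laplace(i):
        return (u2(x + h, y)[i] + u2(x - h, y)[i] + u2(x, y + h)[i]
                + u2(x, y - h)[i] - 4.0 * u2(x, y)[i]) / h ** 2

    px = (p2(x + h, y) - p2(x - h, y)) / (2.0 * h)
    py = (p2(x, y + h) - p2(x, y - h)) / (2.0 * h)
    div = ((u2(x + h, y)[0] - u2(x - h, y)[0])
           + (u2(x, y + h)[1] - u2(x, y - h)[1])) / (2.0 * h)
    return (-laplace(0) + px, -laplace(1) + py, div)


print(stokes_residual(0.3, -0.4))
```

All three residual components are zero up to the discretization error of the difference quotients, which confirms both the momentum equation and the divergence constraint for this pair.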


Choice of stabilization parameter $\alpha$. Theorem 10.2 guarantees stability
of the GLSFEM as long as the parameter $\alpha$ remains in a range
$0 < \alpha < \alpha_{\max}$. $\alpha_{\max}$ is independent of the element
sizes $h_K$ and the approximation orders $k_K$ and is essentially determined
by the best constant $C$ for which the inverse inequality

$$ \|\nabla\varphi\|_{L^2(\hat{K})} \le C k^2 \|\varphi\|_{L^2(\hat{K})} \tag{10.26} $$

holds on the reference element $\hat{K}$ for all polynomials
$\varphi \in S^k(\hat{K})$ and all $k \in \mathbb{N}$
(cf. [54]). In one dimension the best constant $C$ in (10.26) is explicitly known
and  equal  to  3\/2  (if  K  =  (—1, 1)).  In  two  space  dimensions  this  best  constant 
seems not to be available, but we expect it to be of about the same order. In
addition, one may ask whether this upper bound $\alpha_{\max}$ is just an
artefact of the stability proof or whether it can really be observed in
practice. On the other hand, we expect the GLSFEM to become unstable as
$\alpha$ approaches 0. In fact, for $\alpha = 0$ the G- and
GLS-discretizations coincide, and it is well known that the Galerkin method
is unstable for velocity and pressure spaces of the same polynomial order.
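
For orientation, the size of the constant in a one-dimensional inverse inequality can be estimated numerically. The sketch below is an illustration under stated assumptions and not the argument of [54]: it reads the inequality as $\|\varphi'\|_{L^2(-1,1)} \le C k^2 \|\varphi\|_{L^2(-1,1)}$ for polynomials of degree at most $k$, represents differentiation in the $L^2$-orthonormal Legendre basis, and estimates the largest singular value by power iteration. All names are hypothetical.

```python
import math


def inverse_constant(k, iterations=2000):
    """Estimate sup ||phi'|| / (k^2 ||phi||) over polynomials of degree <= k on (-1, 1)."""
    n = k + 1
    # Differentiation in the orthonormal Legendre basis:
    # D[j][m] = sqrt((2j+1)(2m+1)) if j < m and m - j is odd, else 0.
    D = [[math.sqrt((2 * j + 1) * (2 * m + 1)) if (j < m and (m - j) % 2 == 1) else 0.0
          for m in range(n)] for j in range(n)]
    x = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        y = [sum(D[j][m] * x[m] for m in range(n)) for j in range(n)]   # y = D x
        z = [sum(D[j][m] * y[j] for j in range(n)) for m in range(n)]   # z = D^T y
        lam = sum(a * b for a, b in zip(x, z)) / sum(v * v for v in x)  # Rayleigh quotient
        norm = math.sqrt(sum(v * v for v in z))
        x = [v / norm for v in z]
    return math.sqrt(lam) / k ** 2


for k in range(1, 9):
    print(k, inverse_constant(k))
```

The printed ratios stay bounded as $k$ grows, which is the content of a $k^2$-type inverse inequality; the values only concern the one-dimensional reading assumed here.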
Fig. 10.3. Dependence of the relative error on the stabilization parameter $\alpha$.


We addressed these questions numerically by varying $\alpha$ in a large
range. We considered two configurations for the model problem
(10.21)-(10.22) in the L-shaped domain, the first one being $k = 4$, $n = 4$
and $\sigma = 0.5$, the second one $k = 8$, $n = 10$ and $\sigma = 0.5$,
where $k$ is the polynomial degree and $n$, $\sigma$ determine the geometric
mesh $\mathcal{T}^{n,\sigma}$ with $n+1$ layers and grading factor $\sigma$.
In Figure 10.3 the relative errors of the first velocity component and the
pressure are plotted for these two configurations against $\alpha$ ranging
from $10^{-10}$ to $10^{10}$. The error curves become oscillatory for
increasing $\alpha$. The "existence" of an upper bound $\alpha_{\max}$ cannot
be confirmed with absolute certainty. In any case, the performance of the
GLSFEM is rather poor in the range $\alpha > 10^{0}$. But the deterioration
of the GLS scheme as $\alpha$ approaches zero can indeed be observed: the
errors begin to grow and finally explode for $\alpha < 10^{-5}$. In this
range the velocities are still more or less accurate but the obtained
pressures become strongly oscillatory. This phenomenon (already mentioned in
[38]) is to be expected since the pressure terms are in fact the terms that
are stabilized. We see, however, that good results are obtained for
$\alpha \in (10^{-6}, 10^{0})$, and that they depend only weakly on the
particular value of $\alpha$ in this range (see Figure 10.4). We conclude
that in practice the precise value of $\alpha$ is not critical to the
accuracy, as long as the dependence on $h_K$ and $k_K$ is accounted for
properly.

In all our numerical results that follow we use $\alpha = 0.1$.


Numerical experiments for the smooth solution. In Figure 10.5 we present
convergence rates for the h- and p-version G & GLSFEM that we obtained by
approximating the smooth solution (10.24)-(10.25) to the Stokes problem. In
the h-version we use uniform meshes and expect algebraic convergence rates.

The approximation order for the velocity is chosen to be cubic, and this
implies a linear approximation of the pressure in the G method. We start
with 3 elements in the L-shaped domain and uniformly h-refine the mesh. Note
that the meshwidth $h$ is given by $CN^{-1/2}$, where $N$ is the number of
dof. It is evident from Figure 10.5 that the h-version yields algebraic
convergence of order 2 for the G method. For the GLS method the h-version
convergence rate is 3, which is optimal.
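
Observed algebraic orders such as these can be read off from error-versus-dof data. The following sketch assumes the two-dimensional relation $h \approx C N^{-1/2}$ for uniform meshes of fixed order, so that an error behaving like $h^s$ behaves like $N^{-s/2}$. The data in the example are synthetic and purely illustrative (they are not the results of Figure 10.5), and the function name is hypothetical.

```python
import math


def algebraic_rates_in_h(dofs, errors):
    """Observed orders s in error ~ h^s between consecutive meshes, with h ~ N^(-1/2)."""
    rates = []
    for n0, n1, e0, e1 in zip(dofs, dofs[1:], errors, errors[1:]):
        rates.append(2.0 * math.log(e0 / e1) / math.log(n1 / n0))
    return rates


# Synthetic data with error = N^(-3/2), i.e. order 3 in h (illustration only).
dofs = [100.0, 400.0, 1600.0]
errors = [n ** -1.5 for n in dofs]
print(algebraic_rates_in_h(dofs, errors))
```

On a doubly logarithmic plot of error against $N$, an order $s$ in $h$ therefore appears as a straight line of slope $-s/2$.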

Since the exact solution (10.24)-(10.25) is analytic in $\overline{\Omega}$,
we expect exponential convergence of the p-version. We start again with a
3 element mesh and increase the polynomial approximation order $k$ from 3 to
8 for the velocity. Here, we have $p \approx N^{1/2}$. The convergence rates
displayed in Figure 10.5 indicate the exponential convergence of the G & GLS
FEM for this smooth solution.


Numerical experiments for the singular solution. In this section we
present numerical results for the first solution (10.21)-(10.22). We recall
that the solution has a singularity at the reentrant corner. Therefore, it is
necessary to perform mesh refinements towards the singularity in order to
capture its singular behavior. In Figures 10.6 to 10.9 we present
convergence rates that correspond to meshes of affine elements that have
been refined geometrically towards the reentrant corner with grading factor
$\sigma = 0.5$. An example of such a mesh is displayed in Figure 10.2. This
mesh contains $l = 8$ layers of elements, which have been generated by
successively refining 3 initial elements. The irregular nodes in this mesh
are constrained automatically by HP90.

In Figures 10.6 and 10.7 we show the performance of the p-version FEM
(resp. the spectral method) by fixing a grid with $l$ layers, and increasing
the polynomial approximation order $k$ from 3 to 8. As is to be expected, the
graphs indicate algebraic rates of convergence which in fact are very close
to the a-priori bound of $k^{0.5-2\lambda} \approx N^{0.25-\lambda}$, where
$\lambda$ is the constant in (10.23). This a-priori bound is optimal in view
of [4] and the fact that the inf-sup constant $\gamma_N$ in (10.9) is
$Ck^{-0.5}$ in the G method for the elements chosen here [69].


Fig. 10.4. Dependence of the relative error on $\alpha$.


In  Figure  10.7  the  same  plot  is  depicted  for  the  GLSFEM  and  shows  a 
convergence  similar  to  the  GFEM.  This  indicates  that  the  dependence  of  the 
inf-sup  constant  on  the  approximation  order  in  Theorem  10.2  is  probably 
suboptimal. 
Fig. 10.5. h- and p-version G & GLS FEM convergence rates.


For the hp-version we show the convergence rates in Figures 10.8 and 10.9.
Here we do not only vary the polynomial degree but also the grid, i.e. the
number of layers in the mesh. We do this with respect to the parameter
$\mu$, where

$$ l = [\mu \cdot k]. \tag{10.27} $$

Again, here $l$ is the number of layers and $k$ the polynomial degree. The hp
convergence rates for various parameters $\mu$ indicate the exponential
convergence of the hp-version, as expected and predicted by Theorems 5.8 and
5.9.
Fig. 10.6. p-version GFEM convergence rates for geometric meshes with hanging
nodes.


The affine geometric meshes with hanging nodes are obtained by bisecting
elements in the middle, which results in a mesh grading factor of
$\sigma = 0.5$. With this $\sigma$ we obtain reliable results, but recall
from Remark 5.11 that $\sigma = 0.5$ is not optimal. To study the dependence
on $\sigma$, we use geometric meshes with variable order elements that have
bilinear element mappings. An example mesh with geometric refinement toward
the reentrant corner with $\sigma = 0.3$ and 8 layers of elements is shown in
Figure 10.10 (the elements in the layers at the reentrant corner are so small
that they are not visible in Figure 10.10).
Fig. 10.7. p-version GLSFEM convergence rates for geometric meshes with
hanging nodes.


We demonstrate the dependence of the GFEM performance on the geometric mesh
grading in Figure 10.11. The hp-version GFEM converges exponentially for all
values of $\sigma$ on these geometric meshes, in accordance with Theorem 5.9.
Further, the performance is best for $\sigma = 0.15$ and $\sigma = 0.2$,
which are very close to the optimal $\sigma$ in one dimension (see Remark
5.11). In particular, for $\sigma = 0.5$ the error is about one order of
magnitude larger than for the optimal grading factor 0.15. The best result
with $\sigma = 0.5$ is obtained with $N \approx 5000$, while for
$\sigma = 0.15$ the same accuracy is already obtained with 1500 dof. This
underlines the importance of refining towards the singularity with the
grading factor $\sigma = 0.15$.
Fig. 10.8. hp-version GFEM convergence rates for geometric meshes with hanging nodes.


10.6  Almost  incompressibility 

We return to the elasticity problem (9.1) and assume that

$$ \overline{\mu} \ge \mu \ge \underline{\mu} > 0, \qquad \gamma \gg 1, \tag{10.28} $$

i.e. the medium is almost incompressible. Both mixed and stabilized methods
are able to handle the limiting, incompressible case $\gamma \to \infty$ and
are, in fact, robust with respect to $\gamma$.

In what follows, we assume that

$$ \varepsilon := 2\mu/\gamma \in [0, 1]. \tag{10.29} $$
Fig. 10.9. hp-version GLSFEM convergence rates for geometric meshes with
hanging nodes.


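For $\varepsilon > 0$, introduce in (9.1), (9.3) a new variable $p$ by

$$ \varepsilon p := -\nabla \cdot u \tag{10.30} $$

and obtain the saddle-point form of (9.1), (9.3)

$$ \begin{aligned}
   -2\,\mathrm{div}(\mu D(u)) + 2\mu\nabla p &= S \quad \text{in } \Omega, \\
   \nabla \cdot u + \varepsilon p &= 0 \quad \text{in } \Omega.
   \end{aligned} \tag{10.31} $$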
Fig. 10.10. Geometric mesh with 8 layers of elements.

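Mixed hp-FEM. The boundary conditions are

$$ \begin{aligned}
   u &= 0 \quad \text{on } \Gamma_D, \\
   2\mu(D(u) - pI)\,n &= g \quad \text{on } \Gamma_N.
   \end{aligned} \tag{10.32} $$

The variational formulation of (10.31), (10.32) reads:
find $(u, p) \in H^1_D(\Omega)^d \times L^2(\Omega)$ such that for all
$v \in H^1_D(\Omega)^d$

$$ 2(\mu D(u), D(v))_\Omega - 2(\mu p, \nabla \cdot v)_\Omega = (S, v)_\Omega + (g, v)_{\Gamma_N}, \tag{10.33} $$

$$ (\nabla \cdot u, q)_\Omega + (\varepsilon p, q)_\Omega = 0 \quad \forall q \in L^2(\Omega). \tag{10.34} $$

The discretization proceeds by choosing subspaces $V_N \subset H^1_D(\Omega)^d$,
$M_N \subset L^2(\Omega)$: find $(u_N, p_N) \in V_N \times M_N$ such that
(10.33), (10.34) hold for all $(v, q) \in V_N \times M_N$.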

Theorem 10.5. Let $(V_N, M_N)$ satisfy the discrete inf-sup condition (10.10),
with constant $\gamma_N > 0$. Then the bilinear form

$$ B(u, p; v, q) := \{ 2(\mu D(u), D(v))_\Omega - (p, \nabla \cdot v)_\Omega
   + (\nabla \cdot u, q)_\Omega + (\varepsilon p, q)_\Omega \} \tag{10.35} $$
Fig. 10.11. hp-version GFEM convergence rates for geometric meshes with
$\mu = 1$ and varying $\sigma$.


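satisfies, for any $\varepsilon \ge 0$,

$$ \inf_{\substack{0 \ne u \in V_N \\ 0 \ne p \in M_N}} \;
   \sup_{\substack{0 \ne v \in V_N \\ 0 \ne q \in M_N}} \;
   \frac{B(u, p; v, q)}{|||(u, p)||| \; |||(v, q)|||} \ge C \gamma_N^2 $$

where $C > 0$ is independent of $\varepsilon$, $N$, p and

$$ |||(u, p)|||^2 := \|\nabla u\|^2 + \|p\|^2 . \tag{10.36} $$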
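Proof. Let $0 \ne u \in V_N$, $0 \ne p \in M_N$. W.l.o.g. assume
$\mu = 1/2$. Then

$$ B(u, p; u, p) = \|D(u)\|^2 + \|\sqrt{\varepsilon}\, p\|^2 \ge C_K \|\nabla u\|^2 $$

by Korn's inequality (9.10), for any $\varepsilon \ge 0$.

By (10.10), for any $0 \ne p \in M_N$ there exists $w_p \in V_N$ such that

$$ \|\nabla w_p\| = \|p\|, \qquad -(p, \nabla \cdot w_p) \ge \gamma_N \|p\|^2 . $$

Hence, for any $\theta > 0$ we get

$$ \begin{aligned}
   B(u, p; w_p, 0) &\ge \gamma_N \|p\|^2 - \|D(u)\| \, \|D(w_p)\| \\
   &\ge \gamma_N \|p\|^2 - C \|\nabla u\| \, \|\nabla w_p\| \\
   &= \gamma_N \|p\|^2 - C \|\nabla u\| \, \|p\| \\
   &\ge \gamma_N \|p\|^2 - \frac{C}{2\theta} \|\nabla u\|^2 - \frac{C\theta}{2} \|p\|^2 \\
   &= (\gamma_N - C\theta/2) \|p\|^2 - \frac{C}{2\theta} \|\nabla u\|^2 .
   \end{aligned} $$

Let $\delta > 0$ and put $v := u + \delta w_p$, $q = p$. Then

$$ \begin{aligned}
   B(u, p; v, q) &= B(u, p; u, p) + \delta B(u, p; w_p, 0) \\
   &\ge (C_K - \delta C / 2\theta) \|\nabla u\|^2 + \delta (\gamma_N - C\theta/2) \|p\|^2 .
   \end{aligned} $$

Pick $\theta = \gamma_N / C$ and $\delta = C_K \gamma_N / C^2$ to get

$$ \begin{aligned}
   B(u, p; v, q) &\ge \frac{C_K}{2} \|\nabla u\|^2 + \frac{C_K \gamma_N}{C^2} \cdot \frac{\gamma_N}{2} \|p\|^2 \\
   &\ge \frac{C_K}{2} \min(1, \gamma_N^2 / C^2) \; |||(u, p)|||^2 .
   \end{aligned} $$

Since $0 < \gamma_N < \gamma$, there is a constant $C > 0$ independent of
$\varepsilon$, $N$ such that

$$ |||(v, q)||| \le |||(u, p)||| + \delta \|\nabla w_p\|
   = |||(u, p)||| + C_K \gamma_N C^{-2} \|p\| \le C \, |||(u, p)||| . \qquad \Box $$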
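
For the reader's convenience, the effect of the parameter choice in the proof can be verified directly. The display below only restates the algebra for $\theta = \gamma_N / C$ and $\delta = C_K \gamma_N / C^2$, with $C_K$ the Korn constant and $C$ the continuity constant as used in the proof; it adds no new assumption.

```latex
\begin{align*}
C_K - \frac{\delta C}{2\theta}
  &= C_K - \frac{C_K \gamma_N}{C^2} \cdot \frac{C}{2} \cdot \frac{C}{\gamma_N}
   = \frac{C_K}{2}, \\
\delta \Bigl( \gamma_N - \frac{C\theta}{2} \Bigr)
  &= \frac{C_K \gamma_N}{C^2} \Bigl( \gamma_N - \frac{\gamma_N}{2} \Bigr)
   = \frac{C_K}{2} \, \frac{\gamma_N^2}{C^2} .
\end{align*}
```

Both coefficients are positive, and the smaller of the two is $\frac{C_K}{2} \min(1, \gamma_N^2 / C^2)$, which explains the quadratic dependence of the stability constant on $\gamma_N$.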

Choosing $V_N$ and $M_N$ as in the Stokes problem, for example, for
$k_K \ge 2$, $K \in \mathcal{T}$,

$$ V_N = S^{k,1}_0(\Omega, \mathcal{T})^d, \qquad M_N = S^{k-2,0}(\Omega, \mathcal{T}) \tag{10.37} $$

with $\mathcal{T}$ denoting a geometric boundary layer mesh, gives
discretizations of (10.33), (10.34) which are uniformly stable as
$\varepsilon \to 0$ (i.e. as $\gamma \to \infty$). In particular, the
conditioning of the stiffness matrix corresponding to (10.33), (10.34),
which, for constant $\gamma$ and $\mu$, reads

$$ \begin{pmatrix} 2\mu A & -2\mu B^T \\ 2\mu B & \varepsilon M \end{pmatrix} \tag{10.38} $$

is independent of $\varepsilon$ and there holds the stability inequality

$$ 2\mu \left( \|u_N\|_{1,\Omega} + \|p_N\|_{0,\Omega} \right)
   \le C \left( \|S\|_{0,\Omega} + \|g\|_{0,\Gamma_N} \right) \tag{10.39} $$

where $C > 0$ is independent of $\varepsilon \ge 0$ (it depends only on the
inf-sup constant $\gamma_N$ in (10.10)); for a proof we refer for example to
[16], Chapter II. Note that for $\varepsilon > 0$ the variable $p$ in
(10.33), (10.34) is not related to the hydrostatic pressure in (1.2).


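GLS-stabilized hp-FEM. The hp-FEM for (10.33), (10.34) again requires
elements of different order for $V_N$ and $M_N$ if robustness w.r.t.
$\varepsilon = 2\mu/\gamma$ is to hold. Equal order elements cannot achieve
robustness. The remedy is again a GLS stabilization: find
$(u_N, p_N) \in V_N \times M_N$ such that

$$ B_\alpha(u_N, p_N; v, q) = F_\alpha(v, q) \quad \forall (v, q) \in V_N \times M_N \tag{10.40} $$

where $0 < \alpha < \alpha_0$ is a stabilization parameter and (see (10.15))

$$ \begin{aligned}
   B_\alpha(u, p; v, q) := {} & \{ 2(\mu D(u), D(v))_\Omega - (p, \nabla \cdot v)_\Omega
      - (\nabla \cdot u, q)_\Omega - (\varepsilon p, q)_\Omega \} \\
   & - \alpha \sum_{K \in \mathcal{T}} \frac{h_K^2}{k_K^4}
      \left( -2(\nabla \cdot \mu D(u)) + \nabla p, \; -2(\nabla \cdot \mu D(v)) + \nabla q \right)_K
   \end{aligned} \tag{10.41} $$

and

$$ F_\alpha(v, q) := (S, v)_\Omega + (g, v)_{\Gamma_N}
   - \alpha \sum_{K \in \mathcal{T}} \frac{h_K^2}{k_K^4}
     \left( S, \; -2\mu(\nabla \cdot D(v)) + \nabla q \right)_K . \tag{10.42} $$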


Remark 10.6. The triangulation $\mathcal{T}$ in (10.40) must be shape
regular; no GLS stabilization for anisotropic elements is known.


Remark 10.7. In (10.41), (10.42), we assumed that $\mu|_K = \mathrm{const}$
for all $K \in \mathcal{T}$.


Remark  10.8.  Note  that  (10.41)  is  fully  consistent  -  inserting  the  exact  solu¬ 
tion  (u,p),  the  least  squares  terms  cancel. 

The  formulation  (10.41)  is  stable  uniformly  in  e: 
Theorem 10.9. There is $\alpha_0 > 0$ independent of $h_K$, $k_K$ such that
for $0 < \alpha < \alpha_0$ and all $0 \le \varepsilon \le 1$ it holds for

$$ V_N = S^{k,1}_0(\Omega, \mathcal{T})^d, \qquad M_N = S^{k,1}(\Omega, \mathcal{T}) $$

(equal order, continuous elements):

$$ \inf_{\substack{0 \ne u \in V_N \\ 0 \ne p \in M_N}} \;
   \sup_{\substack{0 \ne v \in V_N \\ 0 \ne q \in M_N}} \;
   \frac{B_\alpha(u, p; v, q)}
        {\left( \|u\|_1^2 + (1 + \varepsilon) \|p\|_0^2 \right)^{1/2}
         \left( \|v\|_1^2 + (1 + \varepsilon) \|q\|_0^2 \right)^{1/2}}
   \ge C \alpha $$

where $C > 0$ depends only on $\mu$ and on the shape-regularity of
$\mathcal{T}$. The proof is a slight modification of [27] and omitted here.


10.7  Advection  dominated  compressible  (elastic)  flow 

So far, we discretized only the "elliptic" part of (1.2), incompressible or
elastic. Now we include advection terms of the left hand side of (1.2) into
the problem and consider the compressible Oseen equations:

$$ -\mathrm{div}\, \tau(u) + w \cdot \nabla u = S \quad \text{in } \Omega, \tag{10.43} $$

$$ u = 0 \quad \text{on } \partial\Omega, \tag{10.44} $$

where again, with $\varepsilon$ as in (10.29) and $\mu$, $\gamma$ constant,

$$ \tau(u) = 2\mu \left\{ D(u) + \frac{1}{\varepsilon} \, (\nabla \cdot u) \, I \right\} . \tag{10.45} $$

For $\varepsilon = 0$ we obtain the incompressible limit, i.e. the Oseen
equations (10.5). Mixed boundary conditions like (10.32) can also be posed
instead of (10.44); for ease of notation we develop the methods for (10.44).


Advection stabilized mixed hp-FEM. In (10.43) we have 2 effects: a)
advection dominance, i.e. $|w|$ large, and b) near incompressibility, i.e.
$\varepsilon \to 0$. We handle the latter by adopting a mixed formulation as
in (10.31), and the former by GLS stabilization as in the hp-SDFEM in
Section 8. We will see that the resulting method is stable independent of the
advection size and the incompressibility constraint on high aspect ratio
$(\kappa, \sigma)$-boundary layer meshes.

Using (10.30) (note that $p = -\varepsilon^{-1} \nabla \cdot u$ is not related
to the pressure in (1.2)), we get in (10.43) with (10.45) the system

$$ -2\,\mathrm{div}\, \mu D(u) + 2 \nabla \mu p + w \cdot \nabla u = S \quad \text{in } \Omega, \tag{10.46} $$

$$ \nabla \cdot u + \varepsilon p = 0 \quad \text{in } \Omega. \tag{10.47} $$
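The stabilized saddle point formulation reads: find $u \in V_N$,
$p \in M_N$ such that

$$ B_\alpha(u, p; v, q) = F_\alpha(v, q) \quad \forall (v, q) \in V_N \times M_N \tag{10.48} $$

where $V_N$ and $M_N$ is a pair of stable spaces for the Stokes problem, as
e.g.

$$ V_N = S^{k,1}_0(\Omega, \mathcal{T})^d, \qquad M_N = S^{k-2,0}(\Omega, \mathcal{T}) \tag{10.49} $$

and we define, for a parameter $\alpha > 0$ and with $\delta_K$ as in (8.10),
the forms

$$ \begin{aligned}
   B_\alpha(u, p; v, q) := {} & 2(\mu D(u), D(v))_\Omega - (p, \nabla \cdot v)_\Omega \\
   & + (\nabla \cdot u, q)_\Omega + (\varepsilon p, q)_\Omega
     + \tfrac{1}{2} \{ (w \cdot \nabla u, v)_\Omega - (u, w \cdot \nabla v)_\Omega \} \\
   & + \alpha \sum_K \delta_K \left( -2(\mathrm{div}\, \mu D(u) - \mu \nabla p) + w \cdot \nabla u, \; w \cdot \nabla v \right)_K ,
   \end{aligned} \tag{10.50} $$

$$ F_\alpha(v, q) := (S, v)_\Omega + \alpha \sum_K \delta_K (S, w \cdot \nabla v)_K . \tag{10.51} $$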

Remark 10.10. In (10.49), (10.50) the stabilization is fully consistent once
more; notice, however, that now only the advection term $w \cdot \nabla v$
has been stabilized. Stabilization of the incompressibility condition is not
needed if either $\varepsilon = 1$ or if the pair $(V_N, M_N)$ is stable for
the Stokes problem, as e.g. (10.49). In particular, by the stability of
(10.37) in (10.33), (10.34) on geometric boundary layer meshes, and with the
choice (8.10) of the $\delta_K$, (10.48) is stable on geometric boundary
layer meshes also for advection dominated flow. By Theorem 10.9, (10.48),
(10.49) will also work for the Oseen problem (10.5), i.e. for
$\varepsilon = 0$.

Remark 10.11. In (10.49), (10.50) we used divergence stable mixed elements
and stabilized the method only toward the advection term $w \cdot \nabla u$.
One can, however, also include additional stabilization to accommodate equal
order elements for velocity and pressure, i.e. stabilize also against
divergence instability. This is done in [30], [74].


11  hp-  time-stepping 

All  discretizations  considered  so  far  addressed  the  spatial  parts  of  (1.1)  - 
(1.3),  ignoring  the  time  derivative  altogether.  Here  we  address  the  time- 
discretization  of  (1.1)  -  (1.3).  We  semidiscretize  the  system  (1.1)  -  (1.3)  in 
time,  thereby  reducing  it  to  a  sequence  of  nonlinear,  convection  dominated 
elliptic-hyperbolic  systems  in  space  which  are  of  the  type  considered  above. 
Thus,  our  approach  is,  in  a  sense,  complementary  to  the  usual  method  of  lines. 
Many time stepping schemes have been proposed in the literature based on
schemes from initial-value ODEs and we do not want to survey them here (see
e.g. [35]). We merely observe here that all of the schemes in [35] are based
on Taylor expansions in time and yield error estimates of order
$O(\Delta t^r)$, $r \ge 1$, for solutions smoothly depending on $t$, as the
time step $\Delta t \to 0$; examples are the classical Runge-Kutta methods or
the multistep methods. For these methods, error bounds that are explicit in
the order are usually not available, and where they are, they do not allow
one to deduce spectral convergence for smooth solutions. The error analysis
of low order methods, in particular for viscous, incompressible flow, has
reached some maturity by now (see [51] and the references there).

Here, we present new hp-time-stepping approaches based on an hp-DGFEM in
time [57], [62]. The methods are single step schemes which allow arbitrary
variation in order $r$ as well as in the time step $\Delta t$. Conceptually,
this is reminiscent of the Runge-Kutta-Fehlberg approach to initial value
ODEs. However, there are important differences. The hp-DGFEM converge as the
order $r \to \infty$ while the time step $\Delta t > 0$ is fixed. They give
spectral accuracy in transient problems with smooth time-dependence and, in
conjunction with geometric meshes and variable order in time, give
exponential convergence for parabolic evolution problems with piecewise
analytic (in time) solutions (which arise, e.g., at $t = 0$ for incompatible
initial data or for piecewise analytic forcing terms).

Moreover,  they  are  unconditionally  stable  for  parabolic  problems  inde¬ 
pendent  of  the  spatial  discretization.  This  is  crucial,  since  hp-FEM  in  space 
require  highly  anisotropic  meshes  for  efficient  resolution  of  layers  and  fronts 
which  tend  to  produce  very  stringent  CFL  limitations  in  explicit  schemes. 
The  underlying  variational  structure  of  hp-DGFEM  allows  moreover  for  a- 
posteriori  error  estimation  and  adaptivity. 

We  proceed  as  follows:  we  first  elaborate  on  the  hp-DGFEM  and  the  hp- 
SDFEM  for  first  order  hyperbolic  equations  as  e.g.  (1.1).  It  turns  out  that 
the  analysis  in  Chapter  7  applies  directly  here  as  well.  Next,  we  present 
the hp-DG time stepping technique from [57,62] for (systems of nonlinear)
parabolic  initial  value  problems.  Finally,  we  apply  this  technique  to  some 
parabolic  model  evolution  problems  and  discuss  convergence  results  as  well 
as  implementation  issues. 

11.1  hp-FEM  for  first  order  transient,  hyperbolic  problems 

In a bounded domain $\Omega \subset \mathbb{R}^d$ and for $0 < t < T$,
consider the unsteady linear advection problem

$$ \frac{\partial u}{\partial t} + a \cdot \nabla u + bu = S \quad \text{in } (0, T) \times \Omega \tag{11.1} $$

where $a \in C^1(\overline{\Omega})^d$, $b \in C(\overline{\Omega})$ and
$S \in L^2(\Omega)$. This is the transient variant of (7.1) and, in fact, a
special case of it: we put

$$ \hat{x} := (t, x) \in Q := (0, T) \times \Omega \subset \mathbb{R}^{d+1}, \qquad
   \mathcal{L} u := \hat{a} \cdot \hat{\nabla} u + bu $$

where $\hat{a} := (1, a_1, a_2, \ldots, a_d)^T$,
$\hat{\nabla} = (\partial_t, \partial_1, \ldots, \partial_d)$. Then (11.1)
takes the form (7.1), and the initial condition

$$ u(\cdot, 0) = u_0 \quad \text{in } \Omega \tag{11.2} $$

becomes simply an "inflow" boundary condition on $\Omega \times \{t = 0\}$.
We may therefore now discretize (11.1), (11.2) in $\mathbb{R}^{d+1}$ as
proposed in Section 7; the resulting method will allow, in fact, arbitrary
combinations of space and time meshes and orders, if the hp-DGFEM in Section
7.3 is used. For example, Fig. 11.1 shows a possible mesh in $d = 1$: here
$Q = (0, T) \times (0, 1)$.


Fig. 11.1. Space time mesh for hp-DGFEM

Notice that the element boundaries need not be aligned with the $(x, t)$
axes; this is essential if propagating perturbations arising in hyperbolic
equations are to be tracked accurately with large time steps (see the bold
line in Fig. 11.1).

Note  also  that  we  still  kept  in  Fig.  11.1  time  levels  -  the  first  order  prob¬ 
lems  can  be  solved  explicitly  by  propagating  information  with  the  flow  a 
through  the  elements.  It  should  also  be  clear  from  Section  7  that  space  and 
time  orders  can  be  varied  independently  here.  We  shall  not  go  into  detailed 
error  estimates  here. 

We  next  present  the  DG(r)  scheme  for  nonlinear  initial  value  ODEs.  This 
is  of  independent  interest  also  for  high  order  MOL  discretizations.  In  the 
following subsection we then address the combined space-time discretization
of parabolic problems.

11.2  The  DG(r)-FEM  for  nonlinear  initial  value  problems 

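Let $J = [t_0, t_0 + T]$ for $T > 0$. Let
$f : J \times \mathbb{R}^M \to \mathbb{R}^M$ be continuous and
$u_0 \in \mathbb{R}^M$ be given. Consider the IVP

$$ u'(t) = f(t, u(t)), \quad t \in J, \qquad u(t_0) = u_0 . \tag{11.3} $$

We assume that $f(t, u)$ is uniformly Lipschitz continuous w.r.t. $u$, i.e.
$f$ satisfies

$$ \|f(t, u) - f(t, v)\| \le L \|u - v\|, \qquad u, v \in \mathbb{R}^M, \; t \in J . \tag{11.4} $$

(11.4) implies that (11.3) admits a unique solution
$u(t) \in C^1(J; \mathbb{R}^M)$.

Let $\mathcal{M}$ denote a partition of $J$ into $N$ timesteps
$t_0 < t_1 < \cdots < t_{N-1} < t_N = t_0 + T$ with time intervals
$I_n = (t_{n-1}, t_n)$, and set $\Delta t_n := t_n - t_{n-1}$,
$n = 1, \ldots, N$, $\Delta t = \max\{\Delta t_n : 1 \le n \le N\}$. Let
$\varphi : J \to \mathbb{R}^M$ be a piecewise continuous function on
$\mathcal{M}$. Then we define the one-sided limits

$$ \varphi_n^{\pm} := \lim_{s \to 0, \, s > 0} \varphi(t_n \pm s), \qquad 0 \le n \le N - 1 , $$

and the jumps

$$ [\varphi]_n = \varphi_n^{+} - \varphi_n^{-} . $$

On the time-mesh $\mathcal{M}$, we introduce the space

$$ C_b^0(\mathcal{M}; \mathbb{R}^M) := \{ \varphi : J \to \mathbb{R}^M : \varphi|_{I_n} \in C_b^0(I_n; \mathbb{R}^M) \} $$

of $\mathbb{R}^M$-valued, piecewise continuous and bounded functions. The
IVP (11.3) admits the following variational formulation: find
$u \in C^0(\mathcal{M}; \mathbb{R}^M)$ such that, for all
$\varphi \in C^0(\mathcal{M}; \mathbb{R}^M)$,

$$ \sum_{n=1}^{N} \int_{I_n} \langle u'(t) - f(t, u(t)), \varphi(t) \rangle \, dt
   + \sum_{n=2}^{N} \langle [u]_{n-1}, \varphi_{n-1}^{+} \rangle
   + \langle u_0^{+}, \varphi_0^{+} \rangle = \langle u_0, \varphi_0^{+} \rangle . \tag{11.5} $$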

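DG(r)-discretization. We associate with each time interval $I_n$ a
polynomial degree $r_n \ge 0$ and combine these degrees in the vector
$\mathbf{r} = \{r_n\}_{n=1}^{N}$. Define the subspace

$$ \mathcal{V}^{\mathbf{r}}(\mathcal{M}; \mathbb{R}^M) = \{ \varphi : J \to \mathbb{R}^M :
   \varphi|_{I_n} \in \mathcal{P}^{r_n}(I_n; \mathbb{R}^M), \; 1 \le n \le N \} . \tag{11.6} $$

As before, if $r_n = r$ for all $n$, we write
$\mathcal{V}^{r}(\mathcal{M}; \mathbb{R}^M)$. Then the DG(r)-method reads:
find $U \in \mathcal{V}^{\mathbf{r}}(\mathcal{M}; \mathbb{R}^M)$ such that

$$ \sum_{n=1}^{N} \int_{I_n} \langle U'(t) - f(t, U(t)), \varphi(t) \rangle \, dt
   + \sum_{n=2}^{N} \langle [U]_{n-1}, \varphi_{n-1}^{+} \rangle
   + \langle U_0^{+}, \varphi_0^{+} \rangle = \langle u_0, \varphi_0^{+} \rangle \tag{11.7} $$

for all $\varphi \in \mathcal{V}^{\mathbf{r}}(\mathcal{M}; \mathbb{R}^M)$.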

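Notice that (11.7) is only apparently global: owing to the discontinuity of the test functions $\varphi$, (11.7) amounts to solving successively on $I_n$, $n = 1,2,3,\ldots$, the problems

$$\int_{I_n} \langle \mathbf{U}'(t) - \mathbf{f}(t,\mathbf{U}(t)),\varphi(t)\rangle\,dt + \langle \mathbf{U}_{n-1}^+,\varphi_{n-1}^+\rangle = \langle \mathbf{U}_{n-1}^-,\varphi_{n-1}^+\rangle \tag{11.8}$$

for all $\varphi \in \mathcal{P}^{r_n}(I_n;\mathbb{R}^M)$, where $\mathbf{U}_0^- := \mathbf{u}_0$.

In each timestep, this is a system of $M(r_n + 1)$ nonlinear equations for the polynomial coefficients of $\mathbf{U}|_{I_n}$.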

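To solve it, we propose the fixed point iteration:

Let $\tilde{\mathbf{U}} \in \mathcal{P}^{r_n}(I_n;\mathbb{R}^M)$ be given. Then $\mathbf{U} = T\tilde{\mathbf{U}}$ is the solution of the following problem (linear in $\mathbf{U}$, since $\mathbf{f}$ is evaluated at the given $\tilde{\mathbf{U}}$): for all $\varphi \in \mathcal{P}^{r_n}(I_n;\mathbb{R}^M)$,

$$\int_{I_n} \langle \mathbf{U}'(t),\varphi(t)\rangle\,dt + \langle \mathbf{U}_{n-1}^+,\varphi_{n-1}^+\rangle = \langle \mathbf{U}_{n-1}^-,\varphi_{n-1}^+\rangle + \int_{I_n} \langle \mathbf{f}(t,\tilde{\mathbf{U}}(t)),\varphi(t)\rangle\,dt . \tag{11.9}$$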


A  fixed  point  U  =  TU  of  (11.9)  solves  (11.8).  We  have  [57]: 

Theorem  11.1.  Let  rn  >  0  be  arbitrary  and  assume  the  CFL-condition 


At  =  max{Atn  N}  < 


(11.10) 


Then (11.8) has a unique solution and the fixed point iteration

$$\mathbf{U}^{m+1} = T\,\mathbf{U}^{m}, \qquad \mathbf{U}^{0} = \mathbf{U}_{n-1}^-$$

converges.
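As a minimal illustration of (11.8) and of the iteration in Theorem 11.1, consider the lowest-order case $r_n = 0$ for a scalar, autonomous right-hand side. Then $U$ is constant on $I_n$, (11.8) reduces to $U - \Delta t_n f(U) = U_{n-1}^-$ (the backward Euler step), and the fixed point map (11.9) becomes $U^{m+1} = U_{n-1}^- + \Delta t_n f(U^m)$, which is a contraction when $L\,\Delta t_n < 1$. The following Python sketch makes this concrete; the names `dg0_step` and `dg0_solve`, the tolerance and the test problem are illustrative assumptions and are not taken from [57].

```python
def dg0_step(f, u_prev, dt, tol=1e-12, max_iter=200):
    """One DG(0) step for u' = f(u): solve U - dt*f(U) = u_prev by fixed point iteration."""
    u = u_prev                       # start value U^0 = U_{n-1}^-
    for _ in range(max_iter):
        u_new = u_prev + dt * f(u)   # U^{m+1} = T U^m, cf. (11.9) with r_n = 0
        if abs(u_new - u) < tol:
            return u_new
        u = u_new
    raise RuntimeError("fixed point iteration did not converge")


def dg0_solve(f, u0, T, n_steps):
    """March over n_steps uniform time steps of size T / n_steps."""
    dt = T / n_steps
    u = u0
    for _ in range(n_steps):
        u = dg0_step(f, u, dt)
    return u


# Example: u' = -2u, u(0) = 1, so L = 2 and L*dt = 0.2 < 1.
u_end = dg0_solve(lambda u: -2.0 * u, 1.0, 1.0, 10)
```

For this linear test problem each step multiplies the solution by $1/(1 + 2\Delta t)$, so `u_end` can be compared with $1.2^{-10}$.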

Using a more sophisticated iteration (e.g. Newton's method), larger time steps are allowable. The error $\mathbf{u} - \mathbf{U}$ can be estimated as follows:


Theorem 11.2. There is c > 0 independent of $\mathbf{r}$ and $\mathcal{M}$ and K such that the DG($\mathbf{r}$) solution $\mathbf{U} \in \mathcal{V}^{\mathbf r}(\mathcal{M};\mathbb{R}^M)$ of (11.7) satisfies


u  -  UIWjRM)  -  C(1  +  LT  exP(ciT))  = 

max  ||u  -  V||  2.  irm.  . 


(11.11)

If the time steps $\Delta t_n$ are monotonically increasing,

Hu  ~  UHl=(j,IRm)  -  +  LT  exp(cLT))5||u  -  V||l2(J;,rm)  .  (11.12) 

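Here $\mathbf{V} \in \mathcal{V}^{\mathbf r}(\mathcal{M};\mathbb{R}^M)$ is the interpolant defined on each $I_n$ by $\mathbf{V}_n^- := \mathbf{u}(t_n)$ and

$$\int_{I_n} \langle \mathbf{V},\varphi\rangle\,dt = \int_{I_n} \langle \mathbf{u},\varphi\rangle\,dt \qquad \forall\,\varphi \in \mathcal{P}^{r_n-1}(I_n;\mathbb{R}^M). \tag{11.13}$$

Remark 11.3. The results hold also when $\mathbb{R}^M$ is replaced by a Hilbert space with norm $\|\cdot\|$ and inner product $\langle\cdot,\cdot\rangle$.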

11.3  DG(r)-FEM  for  abstract  initial  boundary  value  problems 

We  will  generalize  the  hp  DG-FEM  to  abstract  parabolic  equations,  including 
convection  dominated  diffusion  and  viscous,  incompressible  flows. 


Abstract Setting. Let $X$, $H$ be complex, separable Hilbert spaces, $X \hookrightarrow H$ with dense injection and norms $\|\cdot\|_X$ and $\|\cdot\|_H$, respectively. Denote the scalar product on $H$ by $(\cdot,\cdot)_H$ and identify $H$ and $H^*$, the antidual of $H$. We get the Gelfand triple

$$X \hookrightarrow H \cong H^* \hookrightarrow X^* \tag{11.14}$$

and write $\langle\cdot,\cdot\rangle_{X^*\times X}$ for the $X^* \times X$ duality pairing and $\|\cdot\|_{X^*}$ for the norm in $X^*$. Typically, for viscous flow, we have $H = L^2(\Omega)$ and $X = H_0^1(\Omega)$.

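Let $J = (a,b)$ be a time interval. Then the weak time derivative $\dot u$ of an $X^*$-valued distribution $u \in \mathcal{D}'(J;X^*)$ is defined by

$$\int_J \langle \dot u, v\rangle_{X^*\times X}\,\varphi(t)\,dt = -\int_J (u(t),v)_H\,\dot\varphi(t)\,dt \tag{11.15}$$

for all $v \in X$, $\varphi \in \mathcal{D}(J)$. This time derivative has the following properties:

$$u \in L^2(J;X),\ \dot u \in L^2(J;X^*) \ \Longrightarrow\ u \in C([a,b];H), \tag{11.16a}$$

$$u,v \in L^2(J;X),\ \dot u,\dot v \in L^2(J;X^*) \ \Longrightarrow\ (u(t),v(t))_H - (u(s),v(s))_H = \int_s^t \langle \dot u, v\rangle_{X^*\times X}\,d\tau + \int_s^t \overline{\langle \dot v, u\rangle_{X^*\times X}}\,d\tau, \tag{11.16b}$$

$$\|u(t)\|_H^2 - \|u(s)\|_H^2 = 2\,\mathrm{Re}\int_s^t \langle \dot u, u\rangle_{X^*\times X}\,d\tau \qquad \forall\, s,t \in [a,b]. \tag{11.16c}$$

On the Gelfand triple (11.14) we introduce the spatial sesquilinear form

$$a : X \times X \to \mathbb{C}.$$

We call the (possibly nonsymmetric) form $a(\cdot,\cdot)$ $(\alpha,\beta)$-elliptic if

$$|a(u,v)| \le \alpha\,\|u\|_X\,\|v\|_X \quad \forall\,u,v \in X, \qquad \mathrm{Re}\,a(u,u) \ge \beta\,\|u\|_X^2 \quad \forall\,u \in X.$$

The form $a(\cdot,\cdot)$ corresponds to a weak formulation of the differential operator $L : X \to X^*$, which could be any of the operators previously considered.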

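Let now $0 < T < \infty$ and $J = (0,T)$. Consider the abstract evolution problem: given $g \in L^2(J;X^*)$, $u_0 \in H$, find $u$ such that

$$\dot u(t) + L u(t) = g(t), \tag{11.17a}$$

$$u(0) = u_0 . \tag{11.17b}$$

The weak formulation of (11.17) is: find $u \in L^2(J;X) \cap H^1(J;X^*)$ with $u(0) = u_0$ such that

$$\int_J \langle \dot u(t), v\rangle_{X^*\times X}\,\varphi(t)\,dt + \int_J a(u,v)\,\varphi(t)\,dt = \int_J \langle g(t), v\rangle_{X^*\times X}\,\varphi(t)\,dt \qquad \forall\,v \in X,\ \forall\,\varphi \in \mathcal{D}(J). \tag{11.18}$$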


The DG(r) method. We discretize (11.18) in time, reducing it to a sequence of spatial problems involving the form $a(\cdot,\cdot)$, which can be discretized using the hp-FEM in Chapters 1-10.

Let $\mathcal{M}$ be a partition of $J = (0,T)$ into $N$ time intervals

$$I_n = [t_{n-1},t_n], \quad 1 \le n \le N, \qquad 0 = t_0 < t_1 < \cdots < t_N = T,$$

and set $\Delta t_n = t_n - t_{n-1} = |I_n|$. Set further $\Delta t = \max_n \Delta t_n$. For $u : J \to X$, define the one-sided limits

$$u_n^{\pm} := \lim_{0<s\to 0} u(t_n \pm s), \qquad 0 \le n \le N-1,$$

and $u_N^- = \lim_{0<s\to 0} u(T - s)$. For $1 \le n \le N-1$, set $[u]_n = u_n^+ - u_n^-$ and introduce on $\mathcal{M} = \{I_n\}_{n=1}^{N}$ the space

$$C_b^0(\mathcal{M};X) = \{u : J \to X \,|\, u|_{I_n} \in C_b^0(I_n;X)\}.$$

By integration by parts in time and elementary algebra, we obtain

Proposition 11.4. A weak solution $u \in L^2(J;X) \cap H^1(J;X^*)$ of (11.17) satisfies

$$\sum_{n=1}^{N}\int_{I_n} \langle \dot u + Lu, v\rangle_{X^*\times X}\,dt + \sum_{n=2}^{N} ([u]_{n-1},v_{n-1}^+)_H + (u_0^+,v_0^+)_H = (u_0,v_0^+)_H + \int_J \langle g,v\rangle_{X^*\times X}\,dt$$

for all $v \in C_b^0(\mathcal{M};X)$.


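Let now

$$\mathcal{P}^{r}(I;X) = \Big\{p : I \to X \,:\, p(t) = \sum_{j=0}^{r} x_j t^j,\ x_j \in X\Big\}, \tag{11.19}$$

$$\mathcal{V}^{\mathbf r}(\mathcal{M};X) = \{u : J \to X \,:\, u|_{I_n} \in \mathcal{P}^{r_n}(I_n;X),\ 1 \le n \le N\}.$$

Then the DG(r) method reads: find $U \in \mathcal{V}^{\mathbf r}(\mathcal{M};X)$ such that

$$B(U,V) := \sum_{n=1}^{N}\int_{I_n} \langle \dot U, V\rangle_{X^*\times X}\,dt + \sum_{n=1}^{N}\int_{I_n} \langle LU, V\rangle_{X^*\times X}\,dt + \sum_{n=2}^{N} ([U]_{n-1},V_{n-1}^+)_H + (U_0^+,V_0^+)_H = (u_0,V_0^+)_H + \sum_{n=1}^{N}\int_{I_n} \langle g(t),V\rangle_{X^*\times X}\,dt \tag{11.20}$$

for all $V \in \mathcal{V}^{\mathbf r}(\mathcal{M};X)$.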

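Once again, (11.20) can be solved recursively. On each $I_n$ we get an elliptic system for $r_n + 1$ unknown fields in $X$. The structure of this system and the algorithmic complexity of its solution crucially depend on the basis functions chosen in time.

Let $I$ be any time interval and let $\{\hat\varphi_j\}_{j=0}^{r}$ and $\{\hat\psi_i\}_{i=0}^{r}$ be two bases of $\mathcal{P}^{r}(-1,1)$, and denote by $\varphi_j$, $\psi_i$ their transported variants on $I$. Then

$$\frac{d\varphi_j}{dt} = \frac{2}{\Delta t}\,\frac{d\hat\varphi_j}{d\hat t}, \qquad \frac{d\psi_i}{dt} = \frac{2}{\Delta t}\,\frac{d\hat\psi_i}{d\hat t}. \tag{11.21}$$

We introduce the matrices

$$A_{ij} = A^1_{ij} + A^2_{ij} = \int_{-1}^{1} \hat\varphi_j'\,\hat\psi_i\,d\hat t + \hat\varphi_j(-1)\,\hat\psi_i(-1), \tag{11.22}$$

$$B_{ij} = \int_{-1}^{1} \hat\varphi_j\,\hat\psi_i\,d\hat t . \tag{11.23}$$

Then $U, V \in \mathcal{P}^{r}(I;X)$ can be written as

$$U = \sum_{j=0}^{r} U_j\,\varphi_j, \qquad V = \sum_{i=0}^{r} V_i\,\psi_i, \qquad U_j, V_i \in X.$$

The problem (11.20) in time interval $I_n$ is equivalent to the elliptic system: find $\{U_j\}_{j=0}^{r_n} \subset X$ such that

$$\sum_{j=0}^{r_n} \Big\{ A_{ij}\,(U_j,V_i)_H + \frac{\Delta t}{2}\,B_{ij}\,a(U_j,V_i) \Big\} = \frac{\Delta t}{2}\,f_i^1 + f_i^2, \qquad i = 0,\ldots,r_n. \tag{11.24}$$

Here we defined, for $V \in X$,

$$f_i^1(V) := \Big(V, \int_{-1}^{1} g\,\hat\psi_i\,d\hat t\Big) \quad\text{and}\quad f_i^2(V) := (U_{n-1}^-,V)\,\hat\psi_i(-1).$$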

Remark  11.5.  If  rn  =  0,  (11.20)  corresponds  to  backward  Euler  and  if  rn  =  1, 
analogs  of  the  Crank-Nicolson  scheme  are  obtained. 
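The first claim of Remark 11.5 can be checked directly. The short derivation below is a sketch under the assumption that $U|_{I_n} \equiv U_n \in X$ and the test function $V|_{I_n} \equiv v \in X$ are constant in time, so that $\dot U = 0$, $U_{n-1}^+ = U_n$ and $U_{n-1}^- = U_{n-1}$ (with $U_0 := u_0$ for $n = 1$); the symbols $U_n$ and $v$ are introduced here for illustration only.

```latex
% Contribution of the interval I_n to (11.20) for r_n = 0:
\int_{I_n} \langle L U_n, v\rangle_{X^*\times X}\,dt + (U_n - U_{n-1}, v)_H
   = \int_{I_n} \langle g(t), v\rangle_{X^*\times X}\,dt
\quad\Longleftrightarrow\quad
\Delta t_n\, a(U_n, v) + (U_n - U_{n-1}, v)_H
   = \Big\langle \int_{I_n} g(t)\,dt,\; v \Big\rangle_{X^*\times X}
\qquad \forall\, v \in X .
% Dividing by \Delta t_n gives the backward Euler step with a time-averaged right-hand side:
\frac{U_n - U_{n-1}}{\Delta t_n} + L U_n = \frac{1}{\Delta t_n}\int_{I_n} g(t)\,dt
\quad \text{in } X^* .
```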


Spectral decoupling. The convection-diffusion system (11.24) is very costly to solve, particularly in three space dimensions and for $r_n > 0$. In practice, it is therefore very important that (11.24) can, in fact, be decoupled into $r_n + 1$ independent equations of the same type. This is achieved by a clever selection of the basis functions $\hat\varphi_i$, $\hat\psi_i$ in (11.23). Assume that we have, for $r > 0$, an $(r+1) \times (r+1)$ matrix $\mathbf{M}$ such that

$$\mathbf{M}^{-1}\mathbf{A}\mathbf{M} = \mathrm{diag}\{\sigma_j\}_{j=0}^{r}, \qquad \mathbf{M}^{-1}\mathbf{B}\mathbf{M} = \mathbf{1}. \tag{11.25}$$

Then, changing from the unknowns $U_i$ to $\hat U_i$ by

$$\hat U_i = \sum_{j=0}^{r} (\mathbf{M}^{-1})_{ij}\,U_j,$$

(11.24) decouples and becomes: for $j = 0,\ldots,r$, find $\hat U_j \in X$ such that

$$\frac{2\sigma_j}{\Delta t}\,(\hat U_j,V_j)_H + a(\hat U_j,V_j) = \hat f_j^1 + \frac{2}{\Delta t}\,\hat f_j^2 \tag{11.26}$$

for all $V_j \in X$, where the $\hat f_j$ are certain linear combinations of the $f_i$. We observe that (11.26) is an elliptic system with an additional mass matrix added, completely analogous to the system resulting from the backward Euler scheme for $r = 0$.

There remains the question whether $\mathbf{M}$ in (11.25) can be found for $r > 0$. This is theoretically open. In practice, up to $r = 50$, such matrices can be found. They are, in all cases, complex, as are the $\sigma_j$; nevertheless, the additional gain by decoupling the system (11.24) is worth the use of complex arithmetic, especially in dimension $d = 3$.

Note also that [62]

$$|\sigma_j| \sim r^2 \quad\text{as } r \to \infty,$$

so that the problems (11.26) are singularly perturbed as $\Delta t \to 0$ or as $r \to \infty$.
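The matrices (11.22), (11.23) and the diagonalization (11.25) can be explored numerically. The Python sketch below assumes one particular choice that is not prescribed in the text: trial and test bases are both the orthonormal Legendre polynomials on $(-1,1)$, so that $\mathbf{B}$ is the identity and $\mathbf{M}$ can be taken as the eigenvector matrix of $\mathbf{A}$. The function name `dg_time_matrices` is hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def dg_time_matrices(r):
    """A, B of (11.22), (11.23) for an orthonormal Legendre basis (trial = test)."""
    x, w = leg.leggauss(r + 2)                      # Gauss rule, exact up to degree 2r + 3
    scale = np.sqrt((2 * np.arange(r + 1) + 1) / 2.0)
    unit = np.eye(r + 1)
    # phi[i, q] = i-th basis function at quadrature node q; dphi holds the derivatives
    phi = np.array([scale[i] * leg.legval(x, unit[i]) for i in range(r + 1)])
    dphi = np.array([scale[i] * leg.legval(x, leg.legder(unit[i])) for i in range(r + 1)])
    phi_left = scale * (-1.0) ** np.arange(r + 1)   # values at -1, since L_i(-1) = (-1)^i
    A = (phi * w) @ dphi.T + np.outer(phi_left, phi_left)
    B = (phi * w) @ phi.T
    return A, B


A, B = dg_time_matrices(1)
sigma, M = np.linalg.eig(A)   # M^{-1} A M = diag(sigma); B is already the identity
```

For $r = 1$ this gives $\mathbf{A} = \begin{pmatrix} 1/2 & \sqrt3/2 \\ -\sqrt3/2 & 3/2 \end{pmatrix}$ with the complex conjugate eigenvalues $1 \pm i/\sqrt2$, in line with the remark that the $\sigma_j$ are complex. Calling `dg_time_matrices` for larger `r` and inspecting `np.abs(sigma)` is one way to look at the growth of $|\sigma_j|$ with $r$.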

hp-error estimates. With the hp-DGFEM, exponential convergence rates for the time stepping scheme (11.20) can be achieved. We present here one result from [57]. The starting point is the following abstract error estimate, valid for any $\mathbf{r}$ and $\mathcal{M}$.

Theorem 11.6. Let $u \in L^2(J;X) \cap H^1(J;X^*)$ be the solution of (11.18) for an elliptic operator $L$ and let $U \in \mathcal{V}^{\mathbf r}(\mathcal{M};X)$ be the discrete solution of the DGFEM (11.20).

Let $Iu \in \mathcal{V}^{\mathbf r}(\mathcal{M};X)$ be the interpolant of $u$ defined on each time interval $I_n$, $1 \le n \le N$, by

$$\int_{I_n} \langle u - Iu, \varphi\rangle_{X^*\times X}\,dt = 0 \quad \forall\,\varphi \in \mathcal{P}^{r_n-1}(I_n;X), \qquad (u - Iu)_n^- = 0 \ \text{in } X.$$

Then there holds the error estimate

llU  “  <  2^1  +  —  j  ||tt  -  Iu\\l2(J-,X)  ■ 

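11.4 An example: Heat-equation

DG-discretization. Here $L = -\Delta$, $X = H_0^1(\Omega)$ and $H = L^2(\Omega)$. On a generic time interval $I = (a,b)$, $\Delta t = b - a$, we have to solve: find $U \in \mathcal{P}^{r}(I;H_0^1(\Omega))$ such that

$$\int_I \{(\dot U,V) + (\nabla U,\nabla V)\}\,dt + (U(a),V(a)) = \int_I (g,V)\,dt + (U^-,V(a)) \qquad \forall\,V \in \mathcal{P}^{r}(I;H_0^1(\Omega)). \tag{11.27}$$

Here $\{\hat\varphi_i\}_{i=0}^{r}$ and $\{\hat\psi_i\}_{i=0}^{r}$ are two bases of $\mathcal{P}^{r}(-1,1)$, and we denote by $\varphi_i$, $\psi_i$ their transported variants on $I$. Then

$$\frac{d\varphi_i}{dt} = \frac{2}{\Delta t}\,\frac{d\hat\varphi_i}{d\hat t}, \qquad \frac{d\psi_i}{dt} = \frac{2}{\Delta t}\,\frac{d\hat\psi_i}{d\hat t}.$$

Again, $U, V \in \mathcal{P}^{r}(I;H_0^1(\Omega))$ can be written as

$$U = \sum_{j=0}^{r} U_j\,\varphi_j, \qquad V = \sum_{i=0}^{r} V_i\,\psi_i, \qquad U_j, V_i \in H_0^1(\Omega). \tag{11.28}$$

Inserting this into (11.27) yields: find $\{U_j\}_{j=0}^{r} \subset H_0^1(\Omega)$ such that

$$\sum_{i,j=0}^{r} \Big\{ \Big[\int_I \varphi_j'\,\psi_i\,dt + \varphi_j(a)\,\psi_i(a)\Big](U_j,V_i) + \Big[\int_I \varphi_j\,\psi_i\,dt\Big](\nabla U_j,\nabla V_i) \Big\} = \sum_{i=0}^{r} \Big\{ \Big(V_i,\int_I g\,\psi_i\,dt\Big) + (U^-,V_i)\,\psi_i(a) \Big\} \tag{11.29}$$

for all $\{V_i\}_{i=0}^{r} \subset H_0^1(\Omega)$.

We introduce the matrices $A_{ij}$, $B_{ij}$ as in (11.22), (11.23).

Then problem (11.29) is the elliptic system: find $\{U_j\}_{j=0}^{r} \subset H_0^1(\Omega)$ such that

$$\sum_{j=0}^{r} \Big\{ A_{ij}\,U_j - \frac{\Delta t}{2}\,B_{ij}\,\Delta U_j \Big\} = \frac{\Delta t}{2}\,f_i^1 + f_i^2, \qquad i = 0,\ldots,r, \tag{11.30}$$

where, for $v \in H_0^1(\Omega)$,

$$f_i^1(v) := \Big(v,\int_{-1}^{1} g\,\hat\psi_i\,d\hat t\Big), \qquad f_i^2(v) := (U^-,v)\,\hat\psi_i(-1).$$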


Decoupling. The work for the solution of the coupled system (11.30) is substantial, in particular in three space dimensions. Using the simultaneous diagonalization of the matrices $\mathbf{A}$ and $\mathbf{B}$, however, there exists $\mathbf{M}$ such that

$$\mathbf{M}^{-1}\mathbf{A}\mathbf{M} = \mathrm{diag}\{\sigma_i\}_{i=0}^{r}, \qquad \mathbf{M}^{-1}\mathbf{B}\mathbf{M} = \mathbf{1},$$

with complex $\sigma_i$.

Then, changing bases $\hat U_i = \sum_j (\mathbf{M}^{-1})_{ij}\,U_j$, (11.30) decouples into $r + 1$ scalar problems:

$$-\frac{\Delta t}{2}\,\Delta \hat U_i + \sigma_i\,\hat U_i = \frac{\Delta t}{2}\,\hat f_i^1 + \hat f_i^2,$$

that is,

$$-\Delta \hat U_i + \frac{2\sigma_i}{\Delta t}\,\hat U_i = \hat f_i^1 + \frac{2}{\Delta t}\,\hat f_i^2, \qquad i = 0,\ldots,r. \tag{11.31}$$

More generally, in the context of (11.20), we get the $r + 1$ decoupled problems: find $\hat U_i \in X$ such that

$$a(\hat U_i,V) + \frac{2\sigma_i}{\Delta t}\,(\hat U_i,V)_H = \hat f_i^1 + \frac{2}{\Delta t}\,\hat f_i^2 \qquad \forall\,V \in X. \tag{11.32}$$

We see that we must solve in each timestep $I_n$ altogether $r_n + 1$ independent elliptic systems of reaction-diffusion type discussed in Section 6, with the same principal part and different right hand sides. This can be done in parallel when each system is assigned to one processor. Notice also that (11.32) is, for small $\Delta t$ or large $r$, singularly perturbed, regardless of the presence of small viscosity effects in $a(\cdot,\cdot)$. Problem (11.32) is now in the form considered in Sections 1-10, and any of the techniques there can be used for space discretization.
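To make the decoupling concrete, the following Python sketch performs one DG(1) time step for a semidiscrete heat equation $\dot u + K u = 0$ by two independent complex solves of the form (11.31). It rests on several assumptions that are not part of the text: a matrix `K` (for instance a finite difference Laplacian) stands in for the spatial discretization, $g = 0$, and trial and test bases are the orthonormal Legendre polynomials, for which $\mathbf{B} = \mathbf{1}$ and $\mathbf{A}$ has the closed form used below. The function name `dg1_heat_step` is hypothetical.

```python
import numpy as np


def dg1_heat_step(u_prev, K, dt):
    """One DG(1) step for u' + K u = 0 via spectral decoupling (orthonormal Legendre basis)."""
    s3 = np.sqrt(3.0)
    A = np.array([[0.5, s3 / 2], [-s3 / 2, 1.5]])           # (11.22) for r = 1
    phi_left = np.array([1 / np.sqrt(2), -np.sqrt(1.5)])    # basis values at -1
    phi_right = np.array([1 / np.sqrt(2), np.sqrt(1.5)])    # basis values at +1
    sigma, M = np.linalg.eig(A)                             # complex conjugate pair
    c = np.linalg.solve(M, phi_left)                        # transformed data M^{-1} f^2
    I = np.eye(len(u_prev))
    # Two independent shifted solves: (sigma_i I + dt/2 K) Uhat_i = c_i u_prev
    Uhat = [np.linalg.solve(sigma[i] * I + 0.5 * dt * K, c[i] * u_prev) for i in range(2)]
    d = phi_right @ M                                       # back-transform, evaluate at t_n
    u_new = d[0] * Uhat[0] + d[1] * Uhat[1]
    return u_new.real                                       # imaginary parts cancel up to roundoff


K = np.array([[2.0, -1.0], [-1.0, 2.0]])
u1 = dg1_heat_step(np.array([1.0, 0.5]), K, 0.1)
```

The two solves inside the list comprehension do not depend on each other and could be assigned to different processors. For this linear problem the result coincides with applying the rational function $(1 + z/3)/(1 - 2z/3 + z^2/6)$, $z = -\Delta t\,K$, to the old value, which provides a simple consistency check.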


hp-error estimates. Combining the abstract a-priori error estimate Theorem 11.6 with time regularity of the heat equation, we obtain exponential convergence, even if the initial data $u_0$ does not satisfy any compatibility condition.

Theorem  11.7.  Consider  the  heat  equation 

ut  -  Au  =  g  in  fl  x  (0,T)  u  =  0  on  dfi  x  (0,T) 

with  initial  data  uq  €  HS(Q)  :=  (L2(0),  2  for  some  0  <  8  <  1/2, 

and  analytic  right  hand  side  g  satisfying 

II 9(l)mma)<Ci\dl  te[0,T],  f’e  lNo. 

Discretize it in time using the hp-DGFEM on a geometric mesh $\mathcal{M}_{n,\sigma}$ with $n$ layers and grading factor $0 < \sigma < 1$ with degrees $r_j$ satisfying, for some $\mu > 0$,

$$r_1 = 0, \qquad r_j \ge \lfloor \mu j \rfloor, \quad j = 2,\ldots,n.$$

Then the semidiscrete solution $U$ obtained from (11.27) satisfies

$$\|u - U\|_{L^2(J;H_0^1(\Omega))} \le C \exp(-bn) \le C' \exp(-b M^{1/2}),$$

where $M$ denotes the number of spatial problems to be solved.
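The relation between $n$ and $M$ in Theorem 11.7 can be made tangible with a small Python sketch. The definition of the geometric mesh is not restated here, so the code assumes a common convention, namely nodes $t_0 = 0$ and $t_j = \sigma^{\,n-j}\,T$ for $j = 1,\ldots,n$, together with the minimal admissible degrees $r_1 = 0$, $r_j = \lfloor \mu j \rfloor$; the function name `geometric_time_mesh` is hypothetical. Since each interval requires $r_j + 1$ spatial solves after decoupling, $M = \sum_j (r_j + 1)$ grows like $n^2$, which is where the exponent $M^{1/2}$ comes from.

```python
import math


def geometric_time_mesh(T, n, sigma, mu):
    """Nodes, degree vector and number of spatial problems for an assumed geometric mesh."""
    # Assumed convention: t_0 = 0, t_j = sigma**(n - j) * T, refined towards t = 0.
    nodes = [0.0] + [T * sigma ** (n - j) for j in range(1, n + 1)]
    # Linearly increasing degrees: r_1 = 0, r_j = floor(mu * j) for j >= 2.
    degrees = [0] + [math.floor(mu * j) for j in range(2, n + 1)]
    n_problems = sum(r + 1 for r in degrees)   # r_j + 1 decoupled solves per interval
    return nodes, degrees, n_problems


nodes, degrees, n_problems = geometric_time_mesh(T=1.0, n=4, sigma=0.5, mu=1.0)
```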

Remark 11.8. We emphasize that no compatibility is required for the initial data for Theorem 11.7 to hold. Analogous results hold also in the abstract setting of Section 11.3 if the operator $L$ is the infinitesimal generator of an analytic semigroup [62].

Abstract. In these lectures we present the basic ideas and recent development in the construction, analysis, and implementation of ENO (Essentially Non-Oscillatory) and WENO (Weighted Essentially Non-Oscillatory) schemes and their applications to computational fluid dynamics. ENO and WENO schemes are high order accurate finite difference or finite volume schemes designed for problems with piecewise smooth solutions containing discontinuities. The key idea lies at the approximation level, where a nonlinear adaptive procedure is used to automatically choose the locally smoothest stencil, hence avoiding crossing discontinuities in the interpolation procedure as much as possible. ENO and WENO schemes have been quite successful in computational fluid dynamics and other applications, especially for problems containing both shocks and complicated smooth solution structures, such as compressible turbulence simulations and aeroacoustics.

1 Introduction

We are concerned in these lectures with high order finite difference and finite volume schemes and their applications to computational fluid dynamics. These are schemes based on interpolations of discrete data, mostly by using algebraic polynomials. The foundation of such interpolation is in approximation theory: a wider interpolation stencil yields a higher order of accuracy, provided the function being interpolated is smooth inside the stencil. Traditional finite difference and finite volume methods are based on fixed stencil interpolations. For example, to obtain an interpolation for cell $i$ to third order accuracy, the information of the three cells $i-1$, $i$ and $i+1$ can be used to build a second degree interpolation polynomial. In other words, one always looks one cell to the left, one cell to the right, plus the center cell itself, regardless of where in the domain one is situated. This works well for globally smooth problems. The resulting scheme is linear for linear PDEs, hence stability can be easily analyzed by Fourier transforms (for the uniform grid periodic case). However, fixed stencil interpolation of second or higher order accuracy is necessarily oscillatory near a discontinuity, see Fig. 3.1, left, in Sect. 3. Such oscillations, which are called the Gibbs phenomenon in spectral methods, do not decay in magnitude when the mesh is refined. They are a nuisance, to say the least, for practical calculations, and often lead to numerical instabilities in nonlinear problems containing discontinuities.
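The mesh independence of these oscillations is easy to reproduce. The Python sketch below interpolates step data at the three points $-h$, $0$, $h$ with a fixed stencil quadratic and evaluates it halfway between the first two points; the helper `lagrange_eval` and the chosen data are illustrative assumptions, not one of the sample codes mentioned later.

```python
def lagrange_eval(nodes, values, x):
    """Evaluate the Lagrange interpolation polynomial through (nodes, values) at x."""
    total = 0.0
    for m, (xm, vm) in enumerate(zip(nodes, values)):
        weight = 1.0
        for l, xl in enumerate(nodes):
            if l != m:
                weight *= (x - xl) / (xm - xl)
        total += vm * weight
    return total


# Step data: 0 at x = -h and x = 0, 1 at x = h; the jump lies inside the stencil.
for h in (0.1, 0.01, 0.001):
    undershoot = lagrange_eval([-h, 0.0, h], [0.0, 0.0, 1.0], -h / 2)
    print(h, undershoot)   # about -0.125 for every h
```

The interpolant undershoots the data by one eighth of the jump regardless of $h$, which is exactly the behaviour that refinement alone cannot cure.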

Earlier attempts to eliminate or reduce such spurious oscillations near discontinuities were mainly based on two approaches: explicit artificial viscosity and limiters. The first approach was to add an artificial viscosity. This could be tuned so that it was large enough near the discontinuity to suppress, or at least reduce, the oscillations, but small elsewhere to maintain high-order accuracy. One disadvantage of this approach is that fine tuning of the parameters controlling the artificial viscosity is problem dependent. The second approach was to apply limiters to eliminate the oscillations. In effect, one reduced the order of accuracy of the interpolation near the discontinuity (e.g. by reducing the slope of a linear interpolant, or by using a linear rather than a quadratic interpolant near the shock). By carefully designing such limiters, the TVD (total variation diminishing) property could be achieved for one dimensional nonlinear scalar problems or linear systems, and maximum norm stability could be achieved for multi dimensional scalar problems. Also, there are usually no free parameters in the limiters to tune. One disadvantage of this approach is that accuracy necessarily degenerates to first order near smooth extrema. This could be fixed by using the TVB (total variation bounded) modifications to the limiter in Shu [85] and Cockburn and Shu [18], but such modifications are not self-similar. We will not discuss the method of adding explicit artificial viscosity or the TVD limiters in these lectures. We refer the readers to the books by Sod [96], LeVeque [66] and Godlewski and Raviart [35], and the references listed therein.

ENO  (Essentially  Non-Oscillatory)  schemes  were  first  introduced  by  Harten, 
Engquist,  Osher  and  Chakravarthy  in  1987  [47].  Their  paper  now  has  become 
a  classic  and  has  been  quoted  numerous  times.  The  Journal  of  Computational 
Physics  decided  to  republish  it  as  part  of  the  journal’s  celebration  of  its  30th 
birthday  [88]. 

The ENO idea proposed in [47] seems to be the first successful attempt to obtain a self similar (i.e. no mesh size dependent parameter), uniformly high order accurate, yet essentially non-oscillatory interpolation (i.e. the magnitude of the oscillations decays as $O(\Delta x^k)$ where $k$ is the order of accuracy) for piecewise smooth functions. The generic solution for hyperbolic conservation laws is in the class of piecewise smooth functions. The reconstruction in [47] is a natural extension of an earlier second order version of Harten and Osher [46]. In [47], Harten, Engquist, Osher and Chakravarthy investigated different ways of measuring local smoothness to determine the local stencil, and developed a hierarchy that begins with one or two cells, then adds one cell at a time to the stencil from the two candidates on the left and right, based on the size of the two relevant Newton divided differences. Although there are other reasonable strategies to choose the stencil based on local smoothness, such as comparing the magnitudes of the highest degree divided differences among all candidate stencils and picking the one with the least absolute value, experience seems to show that the hierarchy proposed in [47] is the most robust for a wide range of grid sizes $\Delta x$, both before and inside the asymptotic regime.
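The hierarchical stencil choice described above can be sketched in a few lines of Python. This is an illustration under simplifying assumptions rather than the procedure of [47] in full: the grid is uniform, so undivided differences of the data are compared in place of Newton divided differences (the common scaling factor does not affect the comparison), the hierarchy starts from the single cell $i$, and ties are resolved to the right. The function name `eno_left_shift` is hypothetical, and the caller must ensure that all candidate stencils stay inside the array.

```python
import numpy as np


def eno_left_shift(v, i, k):
    """Return r such that the chosen k-cell stencil is {i - r, ..., i - r + k - 1}."""
    left = i                                    # start with the one-cell stencil {i}
    for order in range(1, k):
        # Candidate differences when one cell is added on the left or on the right.
        d_left = abs(np.diff(v[left - 1:left + order], n=order)[0])
        d_right = abs(np.diff(v[left:left + order + 1], n=order)[0])
        if d_left < d_right:
            left -= 1                           # the smoother candidate lies to the left
    return i - left


# Step data with a jump between cells 4 and 5.
v = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0])
r_before = eno_left_shift(v, 4, 3)   # stencil {2, 3, 4}
r_after = eno_left_shift(v, 5, 3)    # stencil {5, 6, 7}
```

In both cases the selected stencil stays on one side of the jump, which is the mechanism by which ENO avoids interpolating across discontinuities.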

As  one  can  see  from  the  numerical  examples  in  [47]  and  in  later  papers, 
ENO  schemes  are  indeed  uniformly  high  order  accurate  and  resolve  shocks 
with  sharp  and  monotone  (to  the  eye)  transitions.  ENO  schemes  are  especially 
suitable  for  problems  containing  both  shocks  and  complicated  smooth  flow 
structures,  such  as  those  occurring  in  shock  interactions  with  a  turbulent  flow 
and  shock  interaction  with  vortices. 

Since the publication of the original paper of Harten, Engquist, Osher and Chakravarthy [47], the original authors and many other researchers have followed the pioneering work, improving the methodology and expanding the area of its applications. ENO schemes based on point values and TVD Runge-Kutta time discretizations, which can save computational costs significantly for multi space dimensions, were developed in Shu and Osher [89], [90]. Biasing in the stencil choosing process to enhance stability and accuracy was developed in Fatemi, Jerome and Osher [31] and in Shu [87]. Finite volume ENO schemes based on a staggered grid and Lax-Friedrichs formulation were given in Bianco, Puppo and Russo [9]. Weighted ENO (WENO) schemes were developed, using a convex combination of all candidate stencils instead of just one as in the original ENO: Liu, Osher and Chan [69] for 1D, Jiang and Shu [55] for multi dimensional finite difference formulation with improved accuracy, Friedrich [32] for multi dimensional finite volume formulation, Hu and Shu [49], [50] for multi dimensional finite volume formulation with improved accuracy, and Levy, Puppo and Russo [67] for 1D finite volume based on a staggered grid and Lax-Friedrichs formulation. ENO schemes based on other than polynomial building blocks were constructed in Iske and Soner [52] and in Christofi [17]. Sub-cell resolution and artificial compression to sharpen contact discontinuities were studied in Harten [44], Yang [105], Shu and Osher [90] and in Jiang and Shu [55]. Multidimensional ENO schemes based on general triangulation were developed in Abgrall [1]. ENO and WENO schemes for Hamilton-Jacobi type equations were designed and applied in Osher and Sethian [78], Osher and Shu [79], Lafon and Osher [62] and in Jiang and Peng [57]. ENO schemes using one-sided Jacobians for field by field decomposition, which improves the robustness for calculations of systems, were discussed in Donat and Marquina [28]. Combination of ENO with multiresolution ideas was pursued in Bihari and Harten [10]. Combination of ENO with spectral method using a domain decomposition approach was carried out in Cai and Shu [11]. On the application side, ENO and WENO have been successfully used to simulate shock turbulence interactions, Shu and Osher [90], Shu, Zang, Erlebacher, Whitaker and Osher [91], and Adams and Shariff [2]; to the direct simulation of compressible turbulence, Shu, Zang, Erlebacher, Whitaker and Osher [91], Walsteijn [102], and Ladeinde, O'Brien, Cai and Liu [61]; to relativistic hydrodynamics equations in Dolezal and Wong [27]; to shock vortex interactions and other gas dynamics problems in Casper and Atkins [15], Erlebacher, Hussaini and Shu [30], and in Jiang and Shu [55]; to incompressible flow problems in E and Shu [29] and Harabetian, Osher and Shu [40]; to viscoelasticity equations with fading memory in Shu and Zeng [92]; to semi-conductor device simulation in Fatemi, Jerome and Osher [31] and Jerome and Shu [53], [54]; to image processing in Osher and Sethian [78], Sethian [84], and Siddiqi, Kimia and Shu [93]; etc. This list is definitely incomplete and perhaps biased by the author's own research experience, but one can already see that ENO and WENO have been applied quite extensively in many different fields. Most of the problems solved by ENO and WENO schemes are of the type in which solutions contain both strong shocks and rich smooth region structures. Lower order methods usually have difficulties for such problems and it is thus attractive and efficient to use high order stable methods such as ENO and WENO to handle them.

Today the study and application of ENO and WENO schemes are still very active. We expect the schemes and the basic methodology to be developed further and to become even more successful in the future.

In  these  lectures  we  present  the  basic  ideas  and  recent  development  in  the 
construction,  analysis,  and  implementation  of  ENO  and  WENO  schemes  and 
their  applications  to  computational  fluid  dynamics.  For  readers  interested  in 
coding  the  methods,  sample  codes  are  available  from  the  author. 

2  Reconstruction  and  Approximation  in  One 
Dimension 

This  section  gives  the  necessary  background  information  about  polynomial 
interpolation  and  approximation  in  one  space  dimension. 

Given a grid

$$a = x_{1/2} < x_{3/2} < \cdots < x_{N-1/2} < x_{N+1/2} = b, \tag{2.1}$$

we define cells, cell centers, and cell sizes by

$$I_i \equiv [x_{i-1/2},x_{i+1/2}], \qquad x_i \equiv \tfrac12\,(x_{i-1/2} + x_{i+1/2}), \qquad \Delta x_i \equiv x_{i+1/2} - x_{i-1/2}, \qquad i = 1,2,\ldots,N. \tag{2.2}$$

We denote the maximum cell size by

$$\Delta x \equiv \max_{1 \le i \le N} \Delta x_i . \tag{2.3}$$

2.1 Reconstruction from Cell Averages

The first approximation problem we will face, in solving hyperbolic conservation laws using cell averages (finite volume schemes, see Sect. 4.1), is the following reconstruction problem [47].

Problem 2.1. One dimensional reconstruction.

Given the cell averages of a function $v(x)$:

$$\bar v_i \equiv \frac{1}{\Delta x_i}\int_{x_{i-1/2}}^{x_{i+1/2}} v(\xi)\,d\xi, \qquad i = 1,2,\ldots,N, \tag{2.4}$$

find a polynomial $p_i(x)$, of degree at most $k-1$, for each cell $I_i$, such that it is a $k$-th order accurate approximation to the function $v(x)$ inside $I_i$:

$$p_i(x) = v(x) + O(\Delta x^k), \qquad x \in I_i, \quad i = 1,\ldots,N. \tag{2.5}$$

In particular, this gives approximations to the function $v(x)$ at the cell boundaries

$$v^-_{i+1/2} = p_i(x_{i+1/2}), \qquad v^+_{i-1/2} = p_i(x_{i-1/2}), \qquad i = 1,\ldots,N, \tag{2.6}$$

which are $k$-th order accurate:

$$v^-_{i+1/2} = v(x_{i+1/2}) + O(\Delta x^k), \qquad v^+_{i-1/2} = v(x_{i-1/2}) + O(\Delta x^k), \qquad i = 1,\ldots,N. \tag{2.7}$$

□

The polynomial $p_i(x)$ in Problem 2.1 can be replaced by other simple functions, such as trigonometric polynomials. See Sect. 8.3.

We will not discuss boundary conditions in this section. We thus assume that $\bar v_i$ is also available for $i \le 0$ and $i > N$ if needed.

In  the  following  we  describe  a  procedure  to  solve  Problem  2.1. 

Given the location $I_i$ and the order of accuracy $k$, we first choose a "stencil", based on $r$ cells to the left, $s$ cells to the right, and $I_i$ itself if $r, s \ge 0$, with $r + s + 1 = k$:

$$S(i) \equiv \{I_{i-r},\ldots,I_{i+s}\}. \tag{2.8}$$

There is a unique polynomial of degree at most $k - 1 = r + s$, denoted by $p(x)$ (we will drop the subscript $i$ when it does not cause confusion), whose cell average in each of the cells in $S(i)$ agrees with that of $v(x)$:

$$\frac{1}{\Delta x_j}\int_{x_{j-1/2}}^{x_{j+1/2}} p(\xi)\,d\xi = \bar v_j, \qquad j = i-r,\ldots,i+s. \tag{2.9}$$

This polynomial $p(x)$ is the $k$-th order approximation we are looking for, as it is easy to prove (2.5), see the discussion below, as long as the function $v(x)$ is smooth in the region covered by the stencil $S(i)$.

For solving Problem 2.1, we also need the approximations to the values of $v(x)$ at the cell boundaries, (2.6). Since the mappings from the given cell averages $\bar v_j$ in the stencil $S(i)$ to the values $v^-_{i+1/2}$ and $v^+_{i-1/2}$ in (2.6) are linear, there exist constants $c_{rj}$ and $\tilde c_{rj}$, which depend on the left shift $r$ of the stencil $S(i)$ in (2.8), on the order of accuracy $k$, and on the cell sizes $\Delta x_j$ in the stencil $S(i)$, but not on the function $v$ itself, such that

$$v^-_{i+1/2} = \sum_{j=0}^{k-1} c_{rj}\,\bar v_{i-r+j}, \qquad v^+_{i-1/2} = \sum_{j=0}^{k-1} \tilde c_{rj}\,\bar v_{i-r+j}. \tag{2.10}$$

We note that the difference between the values with superscripts $\pm$ at the same location $x_{i+1/2}$ is due to the possibility of different stencils for cell $I_i$ and for cell $I_{i+1}$. If we identify the left shift $r$ not with the cell $I_i$ but with the point of reconstruction $x_{i+1/2}$, i.e. using the stencil (2.8) to approximate $x_{i+1/2}$, then we can drop the superscripts $\pm$ and also eliminate the need to consider $\tilde c_{rj}$ in (2.10), as it is clear that

$$\tilde c_{rj} = c_{r-1,j}.$$

We summarize this as follows: given the $k$ cell averages

$$\bar v_{i-r},\ \ldots,\ \bar v_{i-r+k-1},$$

there are constants $c_{rj}$ such that the reconstructed value at the cell boundary $x_{i+1/2}$:

$$v_{i+1/2} = \sum_{j=0}^{k-1} c_{rj}\,\bar v_{i-r+j}, \tag{2.11}$$

is $k$-th order accurate:

$$v_{i+1/2} = v(x_{i+1/2}) + O(\Delta x^k). \tag{2.12}$$
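The accuracy statement (2.12) can be observed numerically. The Python sketch below uses the uniform grid constants for $k = 3$, $r = 1$, namely $(-1/6,\ 5/6,\ 1/3)$ from Table 2.1, applies (2.11) to exact cell averages of $v(x) = \sin x$, and compares with $v(x_{i+1/2})$. The function name `boundary_error`, the test function and the location `x0` are illustrative assumptions.

```python
import numpy as np


def boundary_error(h, x0=0.3):
    """Error of the k = 3, r = 1 reconstruction (2.11) at x_{i+1/2} = x0 + h/2 for v = sin."""
    # Boundaries of the cells I_{i-1}, I_i, I_{i+1} on a uniform grid with cell center x_i = x0.
    edges = x0 + h * np.array([-1.5, -0.5, 0.5, 1.5])
    # Exact cell averages of sin: (cos(left) - cos(right)) / h.
    vbar = (np.cos(edges[:-1]) - np.cos(edges[1:])) / h
    c = np.array([-1.0 / 6.0, 5.0 / 6.0, 1.0 / 3.0])   # c_{1j}, j = 0, 1, 2
    return abs(c @ vbar - np.sin(edges[2]))


ratio = boundary_error(0.1) / boundary_error(0.05)
```

Halving the cell size reduces the error by a factor close to $2^3 = 8$, as expected for a third order accurate reconstruction.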

To understand how the constants $\{c_{rj}\}$ are obtained, as well as how the accuracy property (2.5) is proven, we look at the primitive function of $v(x)$:

$$V(x) \equiv \int_{-\infty}^{x} v(\xi)\,d\xi, \tag{2.13}$$

where the lower limit $-\infty$ is not important and can be replaced by any fixed number. Clearly, $V(x_{i+1/2})$ can be expressed by the cell averages of $v(x)$ using (2.4):

$$V(x_{i+1/2}) = \sum_{j=-\infty}^{i} \int_{x_{j-1/2}}^{x_{j+1/2}} v(\xi)\,d\xi = \sum_{j=-\infty}^{i} \bar v_j\,\Delta x_j, \tag{2.14}$$

thus with the knowledge of the cell averages $\{\bar v_j\}$ we also know the primitive function $V(x)$ at the cell boundaries exactly. If we denote the unique polynomial of degree at most $k$, which interpolates $V(x_{j+1/2})$ at the following $k + 1$ points:

$$x_{i-r-1/2},\ \ldots,\ x_{i+s+1/2}, \tag{2.15}$$

by $P(x)$, and denote its derivative by $p(x)$:

$$p(x) \equiv P'(x), \tag{2.16}$$

then it is easy to verify (2.9):

$$\frac{1}{\Delta x_j}\int_{x_{j-1/2}}^{x_{j+1/2}} p(\xi)\,d\xi = \frac{1}{\Delta x_j}\int_{x_{j-1/2}}^{x_{j+1/2}} P'(\xi)\,d\xi = \frac{1}{\Delta x_j}\big(P(x_{j+1/2}) - P(x_{j-1/2})\big) = \frac{1}{\Delta x_j}\big(V(x_{j+1/2}) - V(x_{j-1/2})\big) = \frac{1}{\Delta x_j}\int_{x_{j-1/2}}^{x_{j+1/2}} v(\xi)\,d\xi = \bar v_j, \qquad j = i-r,\ldots,i+s,$$

where the third equality holds because $P(x)$ interpolates $V(x)$ at the points $x_{j-1/2}$ and $x_{j+1/2}$ whenever $j = i-r,\ldots,i+s$. This implies that $p(x)$ is the polynomial we are looking for. Standard approximation theory (see any elementary numerical analysis book) tells us that

$$P'(x) = V'(x) + O(\Delta x^k), \qquad x \in I_i.$$

This is the accuracy requirement (2.5).
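The construction via the primitive function translates almost literally into code. The Python sketch below, for an arbitrary nonuniform stencil, builds $V$ at the cell boundaries from the cell averages as in (2.14) (with the lower limit moved to the first boundary of the stencil), interpolates it by a polynomial $P$ of degree $k$, and differentiates to obtain $p$. The function name `reconstruct_from_primitive` and the sample data are assumptions made for illustration; `numpy.polynomial.Polynomial.fit` is used as the interpolation routine, which is exact here because the number of points equals the degree plus one.

```python
import numpy as np


def reconstruct_from_primitive(edges, vbar):
    """edges: k + 1 cell boundaries of the stencil; vbar: k cell averages. Returns p = P'."""
    dx = np.diff(edges)
    # Primitive function at the cell boundaries, cf. (2.14), with V(edges[0]) = 0.
    V = np.concatenate(([0.0], np.cumsum(vbar * dx)))
    # Degree-k polynomial through k + 1 points: the interpolant P of (2.15).
    P = np.polynomial.Polynomial.fit(edges, V, deg=len(vbar))
    return P.deriv()                                   # p(x) = P'(x), see (2.16)


edges = np.array([0.0, 0.1, 0.25, 0.45])               # nonuniform stencil with k = 3 cells
vbar = np.array([1.0, 2.0, 0.5])
p = reconstruct_from_primitive(edges, vbar)

# Check (2.9): the cell averages of p reproduce the given data.
P_int = p.integ()
averages = (P_int(edges[1:]) - P_int(edges[:-1])) / np.diff(edges)
```

Evaluating `p` at a cell boundary, for example `p(edges[2])`, gives the reconstructed boundary value without ever forming the constants $c_{rj}$ explicitly, at the price of solving a small interpolation problem for each stencil.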

Now  let  us  look  at  the  practical  issue  of  how  to  obtain  the  constants 
{crj}  in  (2.11).  For  this  we  could  use  the  Lagrange  form  of  the  interpolation 
polynomial: 


P(x)  =  £  V(i 


i— r+m- 


4) 


m= 0 


n 

i  =  o 

l  7^  m 


x  -  x  „• 


-r+l- 


-r+m- 


l  —  X. 


i-r+l- 


(2.17) 


For  easier  manipulation  we  subtract  a  constant  V(xi_r_ a)  from  (2.17),  and 
use  the  fact  that 


E  n 


m=0  l  =  0 


X  Xi_r+i_l 
x i—r+m —  j  —  ®t— r+l—  5 


l  ±  m 
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to  obtain: 


ELo  (v<i 


-r+m- 


P(x)-V(Xl_r_.J  = 

l  ^  m 


r+;-^ 


C .  ,  1  - 

*  — r+m—  £ 


-(2.18) 


Taking  derivative  on  both  sides  of  (2.18),  and  noticing  that 


m— 1 

^(^i-r+m-^)  —  V(.xi— r-|)  =  vi-r+jAxi-r+j 

3=0 

because  of  (2.14),  we  obtain 


k  m— 1 

P{x)  =  ^  ^  ^  1  r+jAXi—r+j 

m= 0  j=0 


(E;  =  o  n‘g  =  „  (*-*(-,+,-l)) 

l  ^  m  q  ^  m,l 

n*  =  0  (^t-r+m-i  -  Zi-r+i-i) 


) 

(2.19) 


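Evaluating the expression (2.19) at $x = x_{i+1/2}$, we finally obtain

$$ \begin{aligned}
v_{i+1/2} &= p(x_{i+1/2}) \\
 &= \sum_{j=0}^{k-1} \left( \sum_{m=j+1}^{k}
    \frac{\sum_{l=0,\, l \ne m}^{k} \prod_{q=0,\, q \ne m,l}^{k}
          \left( x_{i+1/2} - x_{i-r+q-1/2} \right)}
         {\prod_{l=0,\, l \ne m}^{k}
          \left( x_{i-r+m-1/2} - x_{i-r+l-1/2} \right)} \right)
    \Delta x_{i-r+j}\, \bar v_{i-r+j},
\end{aligned} $$

i.e. the constants $c_{rj}$ in (2.11) are given by

$$ c_{rj} = \left( \sum_{m=j+1}^{k}
    \frac{\sum_{l=0,\, l \ne m}^{k} \prod_{q=0,\, q \ne m,l}^{k}
          \left( x_{i+1/2} - x_{i-r+q-1/2} \right)}
         {\prod_{l=0,\, l \ne m}^{k}
          \left( x_{i-r+m-1/2} - x_{i-r+l-1/2} \right)} \right)
    \Delta x_{i-r+j}. \tag{2.20} $$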


Although there are many zero terms in the inner sum of (2.20) when $x_{i+1/2}$
is a node in the interpolation, we will keep this general form so that it applies
also to the case where $x_{i+1/2}$ is not an interpolation point.
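Formula (2.20) can be evaluated directly once the cell interfaces are known. The following is a minimal sketch, not part of the original notes: the function name `crj_nonuniform` and the storage convention (the list entry `xh[n]` holds the interface $x_{n-1/2}$) are assumptions made for illustration, and exact rational arithmetic is used only to make the check transparent.

```python
from fractions import Fraction

def crj_nonuniform(xh, i, r, k):
    """Constants c_{rj}, j = 0..k-1, from (2.20) for cell i and left shift r.

    Assumed convention: xh[n] is the interface x_{n-1/2}, so cell n is
    the interval [xh[n], xh[n+1]].
    """
    # the k+1 interfaces x_{i-r+q-1/2}, q = 0..k, used by the stencil
    nodes = [Fraction(xh[i - r + q]) for q in range(k + 1)]
    x = Fraction(xh[i + 1])  # evaluation point x_{i+1/2}
    coeffs = []
    for j in range(k):
        dx_j = nodes[j + 1] - nodes[j]  # width of cell i-r+j
        total = Fraction(0)
        for m in range(j + 1, k + 1):
            numer = Fraction(0)
            for l in range(k + 1):
                if l == m:
                    continue
                prod = Fraction(1)
                for q in range(k + 1):
                    if q != m and q != l:
                        prod *= x - nodes[q]
                numer += prod
            denom = Fraction(1)
            for l in range(k + 1):
                if l != m:
                    denom *= nodes[m] - nodes[l]
            total += numer / denom
        coeffs.append(total * dx_j)
    return coeffs
```

Since the reconstruction reproduces polynomials of degree at most $k-1$ exactly, applying these constants to the exact cell averages of $x^2$ on an irregular grid with $k = 3$ should return the point value at the interface.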

For a nonuniform grid, one would want to pre-compute the constants
$\{c_{rj}\}$ as in (2.20), for $0 \le i \le N$, $-1 \le r \le k-1$, and
$0 \le j \le k-1$, and store them before solving the PDE.

For a uniform grid, $\Delta x_i = \Delta x$, the expression for $c_{rj}$ does not
depend on $i$ or $\Delta x$ any more:

$$ c_{rj} = \sum_{m=j+1}^{k}
   \frac{\sum_{l=0,\, l \ne m}^{k} \prod_{q=0,\, q \ne m,l}^{k} (r - q + 1)}
        {\prod_{l=0,\, l \ne m}^{k} (m - l)}. \tag{2.21} $$

We list in Table 2.1 the constants $c_{rj}$ in this uniform grid case (2.21), for
order of accuracy between $k = 1$ and $k = 6$.


Table 2.1. The constants $c_{rj}$ in (2.21).

 k    r      j=0       j=1       j=2       j=3       j=4       j=5
 1   -1        1
      0        1
 2   -1      3/2      -1/2
      0      1/2       1/2
      1     -1/2       3/2
 3   -1     11/6      -7/6       1/3
      0      1/3       5/6      -1/6
      1     -1/6       5/6       1/3
      2      1/3      -7/6      11/6
 4   -1    25/12    -23/12     13/12      -1/4
      0      1/4     13/12     -5/12      1/12
      1    -1/12      7/12      7/12     -1/12
      2     1/12     -5/12     13/12       1/4
      3     -1/4     13/12    -23/12     25/12
 5   -1   137/60   -163/60    137/60    -21/20       1/5
      0      1/5     77/60    -43/60     17/60     -1/20
      1    -1/20      9/20     47/60    -13/60      1/30
      2     1/30    -13/60     47/60      9/20     -1/20
      3    -1/20     17/60    -43/60     77/60       1/5
      4      1/5    -21/20    137/60   -163/60    137/60
 6   -1    49/20    -71/20     79/20   -163/60     31/30      -1/6
      0      1/6     29/20    -21/20     37/60    -13/60      1/30
      1    -1/30     11/30     19/20    -23/60      7/60     -1/60
      2     1/60     -2/15     37/60     37/60     -2/15      1/60
      3    -1/60      7/60    -23/60     19/20     11/30     -1/30
      4     1/30    -13/60     37/60    -21/20     29/20       1/6
      5     -1/6     31/30   -163/60     79/20    -71/20     49/20

From Table 2.1, we would know, for example, that

$$ v_{i+1/2} = -\frac{1}{6} \bar v_{i-1} + \frac{5}{6} \bar v_i
   + \frac{1}{3} \bar v_{i+1} + O(\Delta x^3). $$
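The entries of Table 2.1 can be regenerated from (2.21) with integer arithmetic. The sketch below is an illustration only; the name `crj_uniform` is hypothetical and does not come from the notes.

```python
from fractions import Fraction

def crj_uniform(r, k):
    """Row (c_{r0}, ..., c_{r,k-1}) of Table 2.1, computed from (2.21)."""
    row = []
    for j in range(k):
        total = Fraction(0)
        for m in range(j + 1, k + 1):
            # numerator: sum over l != m of prod over q != m,l of (r - q + 1)
            numer = 0
            for l in range(k + 1):
                if l == m:
                    continue
                prod = 1
                for q in range(k + 1):
                    if q != m and q != l:
                        prod *= r - q + 1
                numer += prod
            # denominator: prod over l != m of (m - l)
            denom = 1
            for l in range(k + 1):
                if l != m:
                    denom *= m - l
            total += Fraction(numer, denom)
        row.append(total)
    return row
```

For instance, `crj_uniform(1, 3)` gives the coefficients of the third order example just quoted, and each row sums to one, as consistency requires.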

2.2  Conservative  Approximation  to  the  Derivative  from  Point 
Values 

The second approximation problem we will face, in solving hyperbolic
conservation laws using point values (finite difference schemes, see Sect. 4.2),
is that of obtaining a high order conservative approximation to the derivative
from point values [89,90].

Problem  2.2.  One  dimensional  conservative  approximation. 

Given the point values of a function $v(x)$:

$$ v_i = v(x_i), \qquad i = 1, 2, \ldots, N, \tag{2.22} $$

find a numerical flux function

$$ \hat v_{i+1/2} = \hat v(v_{i-r}, \ldots, v_{i+s}), \qquad
   i = 0, 1, \ldots, N, \tag{2.23} $$

such that the flux difference approximates the derivative $v'(x)$ to $k$-th order
accuracy:

$$ \frac{1}{\Delta x_i} \left( \hat v_{i+1/2} - \hat v_{i-1/2} \right)
   = v'(x_i) + O(\Delta x^k), \qquad i = 0, 1, \ldots, N. \tag{2.24} $$

□


We again ignore the boundary conditions here and assume that $v_i$ is
available for $i \le 0$ and $i > N$ if needed.


The  solution  of  this  problem  is  essential  for  the  high  order  conservative 
schemes  based  on  point  values  (finite  difference)  rather  than  on  cell  averages 
(finite  volume). 

This problem looks quite different from Problem 2.1. However, we will
see that there is a close relationship between these two. We assume that the
grid is uniform, $\Delta x_i = \Delta x$. This assumption is, unfortunately,
essential in the following development.

If we can find a function $h(x)$, which may depend on the grid size $\Delta x$,
such that

$$ v(x) = \frac{1}{\Delta x} \int_{x - \Delta x/2}^{x + \Delta x/2}
   h(\xi)\, d\xi, \tag{2.25} $$

then clearly

$$ v'(x) = \frac{1}{\Delta x} \left[ h\left( x + \frac{\Delta x}{2} \right)
   - h\left( x - \frac{\Delta x}{2} \right) \right], $$

hence all we need to do is to use

$$ \hat v_{i+1/2} = h(x_{i+1/2}) + O(\Delta x^k) \tag{2.26} $$

to achieve (2.24). We note here that it would look like an $O(\Delta x^{k+1})$
term in (2.26) is needed in order to get (2.24), due to the $\Delta x$ term in the
denominator. However, in practice, the $O(\Delta x^k)$ term in (2.26) is usually
smooth, hence the difference in (2.24) would give an extra $O(\Delta x)$, just to
cancel the one in the denominator.

It is not easy to approximate $h(x)$ via (2.25), as it is only implicitly
defined there. However, we notice that the known function $v(x)$ is the cell
average of the unknown function $h(x)$, so to find $h(x)$ we just need to use
the reconstruction procedure described in Sect. 2.1. If we take the primitive
of $h(x)$:

$$ H(x) = \int_{-\infty}^{x} h(\xi)\, d\xi, \tag{2.27} $$

then (2.25) clearly implies

$$ H(x_{i+1/2}) = \sum_{j=-\infty}^{i} \int_{x_{j-1/2}}^{x_{j+1/2}}
   h(\xi)\, d\xi = \Delta x \sum_{j=-\infty}^{i} v_j. \tag{2.28} $$

Thus, given the point values $\{v_j\}$, we "identify" them as cell averages of
another function $h(x)$ in (2.25), then the primitive function $H(x)$ is exactly
known at the cell interfaces $x = x_{i+1/2}$. We thus use the same
reconstruction procedure described in Sect. 2.1, to get a $k$-th order
approximation to $h(x_{i+1/2})$, which is then taken as the numerical flux
$\hat v_{i+1/2}$ in (2.23).

In other words, if the "stencil" for the flux $\hat v_{i+1/2}$ in (2.23) is the
following $k$ points:

$$ x_{i-r}, \; \ldots, \; x_{i+s}, \tag{2.29} $$

where $r + s = k - 1$, then the flux $\hat v_{i+1/2}$ is expressed as

$$ \hat v_{i+1/2} = \sum_{j=0}^{k-1} c_{rj}\, v_{i-r+j}, \tag{2.30} $$

where the constants $\{c_{rj}\}$ are given by (2.21) and Table 2.1.

From Table 2.1 we would know, for example, that if

$$ \hat v_{i+1/2} = -\frac{1}{6} v_{i-1} + \frac{5}{6} v_i
   + \frac{1}{3} v_{i+1}, $$

then

$$ \frac{1}{\Delta x} \left( \hat v_{i+1/2} - \hat v_{i-1/2} \right)
   = v'(x_i) + O(\Delta x^3). $$

We emphasize again that, unlike in the reconstruction procedure in Sect. 2.1,
here the grid must be uniform: $\Delta x_j = \Delta x$. Otherwise, it can be
proven that no choice of constants $c_{rj}$ in (2.30) (which may depend on the
local grid sizes but not on the function $v(x)$) could make the conservative
approximation to the derivative (2.24) higher than second order accurate
($k > 2$). The proof is a simple exercise of Taylor expansions. Thus, the high
order finite difference schemes (third order and higher) discussed in these
lecture notes can apply only to uniform or smoothly varying grids.

Because  of  this  equivalence  of  obtaining  a  conservative  approximation 
to  the  derivative  (2.23)-(2.24)  and  the  reconstruction  problem  discussed  in 
Sect.  2.1,  we  will  only  need  to  consider  the  reconstruction  problem  in  the 
following  sections. 
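
The third order example of Sect. 2.2 can be checked numerically. The sketch below is an illustration under stated assumptions (a periodic uniform grid on $[0, 2\pi)$ and the test function $v = \sin x$, neither of which comes from the notes); the helper name `flux_difference_error` is hypothetical.

```python
import math

def flux_difference_error(n):
    """Max error of (vhat_{i+1/2} - vhat_{i-1/2}) / dx against v'(x_i), v = sin."""
    dx = 2.0 * math.pi / n
    v = [math.sin(i * dx) for i in range(n)]  # periodic point values v_i

    def vhat(i):
        # third order flux at x_{i+1/2}: row k = 3, r = 1 of Table 2.1
        return (-v[(i - 1) % n] + 5.0 * v[i % n] + 2.0 * v[(i + 1) % n]) / 6.0

    return max(abs((vhat(i) - vhat(i - 1)) / dx - math.cos(i * dx))
               for i in range(n))

# halving dx should reduce the error by roughly 2**3
ratio = flux_difference_error(40) / flux_difference_error(80)
```

Although each flux is only a third order approximation to $h(x_{i+1/2})$, the flux difference divided by $\Delta x$ still converges at third order, in line with the remark following (2.26).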


2.3  Fixed  Stencil  Approximation 


By fixed stencil, we mean that the left shift $r$ in (2.8) or (2.29) is the same
for all locations $i$. Usually, for a globally smooth function $v(x)$, the best
approximation is obtained either by a central approximation $r = s - 1$ for
even $k$ (here central is relative to the location $x_{i+1/2}$), or by a one
point upwind biased approximation $r = s$ or $r = s - 2$ for odd $k$. For
example, if the grid is uniform, $\Delta x_i = \Delta x$, then a central 4th
order reconstruction for $v_{i+1/2}$, in (2.11), is given by

$$ v_{i+1/2} = -\frac{1}{12} \bar v_{i-1} + \frac{7}{12} \bar v_i
   + \frac{7}{12} \bar v_{i+1} - \frac{1}{12} \bar v_{i+2} + O(\Delta x^4), $$

and the two one point upwind biased 3rd order reconstructions for $v_{i+1/2}$ in
(2.11), are given by

$$ v_{i+1/2} = -\frac{1}{6} \bar v_{i-1} + \frac{5}{6} \bar v_i
   + \frac{1}{3} \bar v_{i+1} + O(\Delta x^3) $$

$$ \text{or} \quad v_{i+1/2} = \frac{1}{3} \bar v_i + \frac{5}{6} \bar v_{i+1}
   - \frac{1}{6} \bar v_{i+2} + O(\Delta x^3). $$

Similarly, a central 4th order flux (2.30) is

$$ \hat v_{i+1/2} = -\frac{1}{12} v_{i-1} + \frac{7}{12} v_i
   + \frac{7}{12} v_{i+1} - \frac{1}{12} v_{i+2}, $$

which gives

$$ \frac{1}{\Delta x} \left( \hat v_{i+1/2} - \hat v_{i-1/2} \right)
   = v'(x_i) + O(\Delta x^4), $$

and the two one point upwind biased 3rd order fluxes (2.30) are given by

$$ \hat v_{i+1/2} = -\frac{1}{6} v_{i-1} + \frac{5}{6} v_i
   + \frac{1}{3} v_{i+1} $$

$$ \text{or} \quad \hat v_{i+1/2} = \frac{1}{3} v_i + \frac{5}{6} v_{i+1}
   - \frac{1}{6} v_{i+2}, $$

which gives

$$ \frac{1}{\Delta x} \left( \hat v_{i+1/2} - \hat v_{i-1/2} \right)
   = v'(x_i) + O(\Delta x^3). $$

Traditional central and upwind schemes, either finite volume or finite
difference, can be derived by these fixed stencil reconstructions or flux
differenced approximations to the derivatives.

3  ENO  and  WENO  Reconstruction  and  Approximation 
in  One  Dimension 

In the previous section we were mainly concerned with the approximation
result when the stencil is chosen and fixed. In this section we will mainly
discuss the issue of how to choose the stencils.

For solving hyperbolic conservation laws, we are interested in the class of
piecewise smooth functions. These are functions which have as many derivatives
as the scheme calls for, everywhere except at finitely many isolated
points. At these finitely many discontinuity points, the function $v(x)$ and its
derivatives are assumed to have finite left and right limits. Such functions are
"generic" for solutions to hyperbolic conservation laws, in the sense that in
applications we mostly encounter such functions.

For such piecewise smooth functions, the order of accuracy we refer to
in these lecture notes is formal; that is, it is defined as the accuracy
determined by the local truncation error in the smooth regions of the
function. This is the convention in the literature when discussing
discontinuous solutions.

If the function $v(x)$ is only piecewise smooth, a fixed stencil approximation
described in Sect. 2.3 may not be adequate near discontinuities.
Fig. 3.1 (left) gives the 4-th order (piecewise cubic) interpolation with a
central stencil for the step function, i.e. the polynomial approximation inside
the interval $[x_{i-1/2}, x_{i+1/2}]$ interpolates the step function at the four
points $x_{i-3/2}$, $x_{i-1/2}$, $x_{i+1/2}$, $x_{i+3/2}$. Notice the obvious
over/undershoots for the cells near the discontinuity.

These oscillations (termed the Gibbs phenomenon in spectral methods)
happen because the stencils, as defined by (2.15), actually contain the
discontinuous cell for $x_i$ close enough to the discontinuity. As a result, the
approximation property (2.5) is no longer valid in such stencils.
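
The same effect shows up when a fixed stencil is applied to cell averages. The sketch below is an illustration, not the computation behind Fig. 3.1: it applies the central 4th order reconstruction of Sect. 2.3 to the cell averages of a step function whose jump sits on a cell interface. The ten-cell data set and the helper name `central4` are assumptions.

```python
def central4(vbar, i):
    """Central 4th order value at x_{i+1/2} (row k = 4, r = 1 of Table 2.1)."""
    return (-vbar[i - 1] + 7.0 * vbar[i] + 7.0 * vbar[i + 1] - vbar[i + 2]) / 12.0

# cell averages of a step function: 0 in cells 0..4, 1 in cells 5..9
vbar = [0.0] * 5 + [1.0] * 5

# reconstructed interface values for the cells whose stencil fits in the data
values = [central4(vbar, i) for i in range(1, 8)]

undershoot = min(values)  # falls below the exact minimum 0
overshoot = max(values)   # rises above the exact maximum 1
```

Stencils that straddle the jump produce values outside the range $[0, 1]$ of the data, which is exactly the behavior the adaptive stencil of Sect. 3.1 is designed to avoid.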

3.1  ENO  Approximation 

A closer look at Fig. 3.1 (left) motivates the idea of "adaptive stencil", namely,
the left shift $r$ changes with the location $x_i$. The basic idea is to avoid
including the discontinuous cell in the stencil, if possible.

To achieve this effect, we need to look at the Newton formulation of the
interpolation polynomial.

Fig. 3.1. Fixed central stencil cubic interpolation (left) and ENO cubic
interpolation (right) for the step function. Solid: exact function; Dashed:
interpolant piecewise cubic polynomials.

We first review the definition of the Newton divided differences. The 0-th
degree divided differences of the function $V(x)$ in (2.13)-(2.14) are defined
by:

$$ V[x_{i-1/2}] = V(x_{i-1/2}); \tag{3.1} $$

and in general the $j$-th degree divided differences, for $j \ge 1$, are defined
inductively by

$$ V[x_{i-1/2}, \ldots, x_{i+j-1/2}] =
   \frac{V[x_{i+1/2}, \ldots, x_{i+j-1/2}] - V[x_{i-1/2}, \ldots, x_{i+j-3/2}]}
        {x_{i+j-1/2} - x_{i-1/2}}. \tag{3.2} $$

Similarly, the divided differences of the cell averages $\bar v$ in (2.4) are
defined by

$$ \bar v[x_i] = \bar v_i; \tag{3.3} $$

and in general

$$ \bar v[x_i, \ldots, x_{i+j}] =
   \frac{\bar v[x_{i+1}, \ldots, x_{i+j}] - \bar v[x_i, \ldots, x_{i+j-1}]}
        {x_{i+j} - x_i}. \tag{3.4} $$

We note that, by (2.14),

$$ V[x_{i-1/2}, x_{i+1/2}] =
   \frac{V(x_{i+1/2}) - V(x_{i-1/2})}{x_{i+1/2} - x_{i-1/2}} = \bar v_i,
   \tag{3.5} $$

i.e. the 0-th degree divided differences of $\bar v$ are the first degree divided
differences of $V(x)$. We can then write the divided differences of $V(x)$ of
first degree and higher in terms of $\bar v$, using (3.5) and (3.2), thus
completely avoiding the computation of $V$.

The Newton form of the $k$-th degree interpolation polynomial $P(x)$, which
interpolates $V(x)$ at the $k+1$ points (2.15), can be expressed using the
divided differences (3.1)-(3.2) by

$$ P(x) = \sum_{j=0}^{k} V[x_{i-r-1/2}, \ldots, x_{i-r+j-1/2}]
   \prod_{m=0}^{j-1} \left( x - x_{i-r+m-1/2} \right). \tag{3.6} $$

We can take the derivative of (3.6) to get $p(x)$ in (2.16):

$$ p(x) = \sum_{j=1}^{k} V[x_{i-r-1/2}, \ldots, x_{i-r+j-1/2}]
   \sum_{m=0}^{j-1} \prod_{l=0,\, l \ne m}^{j-1}
   \left( x - x_{i-r+l-1/2} \right). \tag{3.7} $$

Notice that only first and higher degree divided differences of $V(x)$ appear
in (3.7). Hence by (3.5), we can express $p(x)$ completely by the divided
differences of $\bar v$, without any need to reference $V(x)$.

Let us now recall an important property of divided differences:

$$ V[x_{i-1/2}, \ldots, x_{i+j-1/2}] = \frac{V^{(j)}(\xi)}{j!} \tag{3.8} $$

for some $\xi$ inside the stencil: $x_{i-1/2} < \xi < x_{i+j-1/2}$, as long as the
function $V(x)$ is smooth in this stencil. If $V(x)$ is discontinuous at some
point inside the stencil, then it is easy to verify that

$$ V[x_{i-1/2}, \ldots, x_{i+j-1/2}] = O\left( \frac{1}{\Delta x^j} \right).
   \tag{3.9} $$

Thus the divided difference is a measurement of the smoothness of the function
inside the stencil.

We now describe the ENO idea by using (3.6). Suppose our job is to find
a stencil of $k+1$ consecutive points, which must include $x_{i-1/2}$ and
$x_{i+1/2}$, such that $V(x)$ is "the smoothest" in this stencil compared with
other possible stencils. We perform this job by breaking it into steps; in each
step we only add one point to the stencil. We thus start with the two point
stencil

$$ \tilde S_2(i) = \{ x_{i-1/2}, x_{i+1/2} \}, \tag{3.10} $$

where we have used $\tilde S$ to denote a stencil for the primitive function
$V$. Notice that the stencil $\tilde S$ for $V$ has a corresponding stencil $S$
for $\bar v$ through (3.5), for example (3.10) corresponds to a single cell
stencil

$$ S(i) = \{ I_i \} $$

for $\bar v$. The linear interpolation on the stencil $\tilde S_2(i)$ in (3.10)
can be written in the Newton form as

$$ P^1(x) = V[x_{i-1/2}] + V[x_{i-1/2}, x_{i+1/2}]
   \left( x - x_{i-1/2} \right). $$

At the next step, we have only two choices to expand the stencil by adding
one point: we can either add the left neighbor $x_{i-3/2}$, resulting in the
following quadratic interpolation

$$ R(x) = P^1(x) + V[x_{i-3/2}, x_{i-1/2}, x_{i+1/2}]
   \left( x - x_{i-1/2} \right) \left( x - x_{i+1/2} \right), \tag{3.11} $$

or add the right neighbor $x_{i+3/2}$, resulting in the following quadratic
interpolation

$$ S(x) = P^1(x) + V[x_{i-1/2}, x_{i+1/2}, x_{i+3/2}]
   \left( x - x_{i-1/2} \right) \left( x - x_{i+1/2} \right). \tag{3.12} $$

We note that the deviations from $P^1(x)$ in (3.11) and (3.12), are the same
function

$$ \left( x - x_{i-1/2} \right) \left( x - x_{i+1/2} \right) $$

multiplied by two different constants

$$ V[x_{i-3/2}, x_{i-1/2}, x_{i+1/2}] \quad \text{and} \quad
   V[x_{i-1/2}, x_{i+1/2}, x_{i+3/2}]. \tag{3.13} $$

These two constants are the two second degree divided differences of $V(x)$
in two different stencils. We have already noticed before, in (3.8) and (3.9),
that a smaller divided difference implies the function is "smoother" in that
stencil. We thus decide upon which point to add to the stencil, by comparing
the two relevant divided differences (3.13), and picking the one with a smaller
absolute value. Thus, if

$$ \left| V[x_{i-3/2}, x_{i-1/2}, x_{i+1/2}] \right| <
   \left| V[x_{i-1/2}, x_{i+1/2}, x_{i+3/2}] \right|, \tag{3.14} $$

we will take the 3 point stencil as

$$ \tilde S_3(i) = \{ x_{i-3/2}, x_{i-1/2}, x_{i+1/2} \}; $$

otherwise, we will take

$$ \tilde S_3(i) = \{ x_{i-1/2}, x_{i+1/2}, x_{i+3/2} \}. $$

This procedure can be continued, with one point added to the stencil at
each step, according to the smaller of the absolute values of the two relevant
divided differences, until the desired number of points in the stencil is reached.

We note that, for the uniform grid case $\Delta x_i = \Delta x$, there is no need
to compute the divided differences as in (3.2). We should use undivided
differences instead:

$$ V\langle x_{i-1/2}, x_{i+1/2} \rangle = V[x_{i-1/2}, x_{i+1/2}] = \bar v_i
   \tag{3.15} $$

(see (3.5)), and

$$ V\langle x_{i-1/2}, \ldots, x_{i+j+1/2} \rangle =
   V\langle x_{i+1/2}, \ldots, x_{i+j+1/2} \rangle -
   V\langle x_{i-1/2}, \ldots, x_{i+j-1/2} \rangle, \qquad j \ge 1.
   \tag{3.16} $$

The Newton interpolation formulae (3.6)-(3.7) should also be adjusted
accordingly. This both saves computational time and reduces round-off effects.
The FORTRAN program for this ENO choosing process is very simple:

* assuming the m-th degree divided (or undivided) differences
* of V(x), with x_{i-1/2} as the left-most point in the arguments,
* are stored in V(i,m), also assuming that "is" is the
* left-most point in the stencil for cell i for a k-th degree
* polynomial
      is=i
      do m=2,k
        if (abs(V(is-1,m)) .lt. abs(V(is,m))) is=is-1
      enddo

Once the stencil $\tilde S(i)$, hence $S(i)$, in (2.8) is found, one could use
(2.11), with the prestored values of the constants $c_{rj}$, (2.20) or (2.21), to
compute the reconstructed values at the cell boundary. Or, one could use (2.30)
to compute the fluxes. An alternative way is to compute the values or fluxes
using the Newton form (3.7) directly. The computational cost is about the
same.

We  summarize  the  ENO  reconstruction  procedure  in  the  following 

Algorithm 3.1. 1D ENO reconstruction.

Given the cell averages $\{\bar v_i\}$ of a function $v(x)$, we obtain a
piecewise polynomial reconstruction, of degree at most $k-1$, using ENO, in the
following way:

1. Compute the divided differences of the primitive function $V(x)$, for
   degrees 1 to $k$, using $\bar v$, (3.5) and (3.2).

   If the grid is uniform, $\Delta x_i = \Delta x$, at this stage, undivided
   differences (3.15)-(3.16) should be computed instead.

2. In cell $I_i$, start with a two point stencil

   $$ \tilde S_2(i) = \{ x_{i-1/2}, x_{i+1/2} \} $$

   for $V(x)$, which is equivalent to a one point stencil,

   $$ S_1(i) = \{ I_i \} $$

   for $\bar v$.

3. For $l = 2, \ldots, k$, assuming

   $$ \tilde S_l(i) = \{ x_{j+1/2}, \ldots, x_{j+l-1/2} \} $$

   is known, add one of the two neighboring points, $x_{j-1/2}$ or
   $x_{j+l+1/2}$, to the stencil, following the ENO procedure:

   - If

     $$ \left| V[x_{j-1/2}, \ldots, x_{j+l-1/2}] \right| <
        \left| V[x_{j+1/2}, \ldots, x_{j+l+1/2}] \right|, \tag{3.17} $$

     add $x_{j-1/2}$ to the stencil $\tilde S_l(i)$ to obtain

     $$ \tilde S_{l+1}(i) = \{ x_{j-1/2}, \ldots, x_{j+l-1/2} \}; $$

   - Otherwise, add $x_{j+l+1/2}$ to the stencil $\tilde S_l(i)$ to obtain

     $$ \tilde S_{l+1}(i) = \{ x_{j+1/2}, \ldots, x_{j+l+1/2} \}. $$

4. Use the Lagrange form (2.19) or the Newton form (3.7) to obtain $p_i(x)$,
   which is a polynomial of degree at most $k-1$ in $I_i$, satisfying the
   accuracy condition (2.5), as long as $v(x)$ is smooth in $I_i$.

   We could use $p_i(x)$ to get the approximations at the cell boundaries:

   $$ v^-_{i+1/2} = p_i(x_{i+1/2}), \qquad v^+_{i-1/2} = p_i(x_{i-1/2}). $$

   However, it is usually more convenient, when the stencil is known, to use
   (2.10), with $c_{rj}$ defined by (2.20) for a nonuniform grid, or by (2.21)
   and Table 2.1 for a uniform grid, to compute an approximation to $v(x)$ at
   the cell boundaries.

□ 
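
The steps of Algorithm 3.1 can be sketched for a uniform grid and $k = 3$ as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the undivided differences are stored in a dictionary `D` keyed by degree, boundary treatment is omitted, and the routine is only valid for interior cells with $k - 1 \le i \le n - k$ (0-based indexing over `n` cells).

```python
def undivided_differences(vbar, k):
    """D[m][j]: m-th degree undivided difference of V with x_{j-1/2} left-most.

    D[1][j] = vbar[j] by (3.15); higher degrees follow the recursion (3.16).
    """
    D = {1: list(vbar)}
    for m in range(2, k + 1):
        prev = D[m - 1]
        D[m] = [prev[j + 1] - prev[j] for j in range(len(prev) - 1)]
    return D

def eno_left_shift(D, i, k):
    """Left shift r of the ENO stencil {I_{i-r}, ..., I_{i-r+k-1}} for cell i."""
    start = i  # left-most cell of the current stencil
    for m in range(2, k + 1):
        # same comparison as the FORTRAN fragment: extend left if smoother there
        if abs(D[m][start - 1]) < abs(D[m][start]):
            start -= 1
    return i - start

# rows r = 0, 1, 2 of Table 2.1 for k = 3
C3 = {0: (1 / 3, 5 / 6, -1 / 6),
      1: (-1 / 6, 5 / 6, 1 / 3),
      2: (1 / 3, -7 / 6, 11 / 6)}

def eno3_value(vbar, D, i):
    """Third order ENO approximation to v(x_{i+1/2}) from cell i, using (2.11)."""
    r = eno_left_shift(D, i, 3)
    return sum(c * vbar[i - r + j] for j, c in enumerate(C3[r]))
```

Applied to the cell averages of a step function whose jump lies on a cell interface, every selected stencil stays on one side of the jump, so the reconstructed interface values are 0 or 1 up to round-off.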

For the same piecewise cubic interpolation to the step function, but this
time using the ENO procedure with a two point stencil
$\tilde S_2(i) = \{ x_{i-1/2}, x_{i+1/2} \}$ in Step 2 of Algorithm 3.1, we
obtain a non-oscillatory interpolation, in Fig. 3.1 (right).

For a piecewise smooth function $V(x)$, ENO interpolation starting with
a two point stencil $\tilde S_2(i) = \{ x_{i-1/2}, x_{i+1/2} \}$ in Step 2 of
Algorithm 3.1, as was shown in Fig. 3.1 (right), has the following properties
[48]:

1. The accuracy condition

   $$ P_i(x) = V(x) + O(\Delta x^{k+1}), \qquad x \in I_i $$

   is valid for any cell $I_i$ which does not contain a discontinuity.

   This implies that the ENO interpolation procedure can recover the full
   high order accuracy right up to the discontinuity.

2. $P_i(x)$ is monotone in any cell $I_i$ which does contain a discontinuity of
   $V(x)$.

3. The reconstruction is TVB (total variation bounded). That is, there exists
   a function $z(x)$, satisfying

   $$ z(x) = P_i(x) + O(\Delta x^{k+1}), \qquad x \in I_i $$

   for any cell $I_i$, including those cells which contain discontinuities, such
   that

   $$ TV(z) \le TV(V). $$

Property 3 is clearly a consequence of Properties 1 and 2 (just take $z(x)$ to
be $V(x)$ in the smooth cells and take $z(x)$ to be $P_i(x)$ in the cells
containing discontinuities). It is quite interesting that Property 2 holds. One
would have expected trouble in those "shocked cells", i.e. cells $I_i$ which
contain discontinuities, for ENO would not help for such cases as the stencil
starts with two points already containing a discontinuity. We will give a proof
of Property 2 for a simple but illustrative case, i.e. when $V(x)$ is a step
function

$$ V(x) = \begin{cases} 0, & x \le 0; \\ 1, & x > 0, \end{cases} $$

and the $k$-th degree polynomial $P(x)$ interpolates $V(x)$ at $k+1$ points

$$ x_{1/2} < x_{3/2} < \ldots < x_{k+1/2} $$

containing the discontinuity

$$ x_{j_0 - 1/2} < 0 < x_{j_0 + 1/2} $$

for some $j_0$ between 1 and $k$. For any interval which does not contain the
discontinuity 0:

$$ [x_{j-1/2}, x_{j+1/2}], \qquad j \ne j_0, \tag{3.18} $$

we have

$$ P(x_{j-1/2}) = V(x_{j-1/2}) = V(x_{j+1/2}) = P(x_{j+1/2}), $$

hence there is at least one point $\xi_j$ in between,
$x_{j-1/2} < \xi_j < x_{j+1/2}$, such that $P'(\xi_j) = 0$. This way we can
find $k-1$ distinct zeroes for $P'(x)$, as there are $k-1$ intervals (3.18)
which do not contain the discontinuity 0. However, $P'(x)$ is a non-zero
polynomial of degree at most $k-1$, hence can have at most $k-1$ distinct
zeroes. This implies that $P'(x)$ does not have any zero inside the shocked
interval $[x_{j_0-1/2}, x_{j_0+1/2}]$, i.e. $P(x)$ is monotone in this shocked
interval. This proof can be generalized to a proof for Property 2 [48].


3.2  WENO  Approximation 

In this subsection we describe the recently developed WENO (weighted ENO)
reconstruction procedure [69,55]. WENO is based on ENO, of course. For
simplicity of presentation, in this subsection we assume the grid is uniform,
i.e. $\Delta x_i = \Delta x$.

As we can see from Sect. 3.1, ENO reconstruction is uniformly high order
accurate right up to the discontinuity. It achieves this effect by adaptively
choosing the stencil based on the absolute values of divided differences.
However, one could make the following remarks about ENO reconstruction,
indicating room for improvement:

1. The stencil might change even by a round-off error perturbation near
   zeroes of the solution and its derivatives. That is, when both sides of
   (3.17) are near 0, a small change at the round-off level would change the
   direction of the inequality and hence the stencil. In smooth regions, this
   "free adaptation" of stencils is clearly not necessary. Moreover, this may
   cause loss of accuracy when applied to a hyperbolic PDE [83,87].

2. The resulting numerical flux (2.23) is not smooth, as the stencil pattern
   may change at neighboring points.

3. In the stencil choosing process, $k$ candidate stencils are considered,
   covering $2k-1$ cells, but only one of the stencils is actually used in
   forming the reconstruction (2.10) or the flux (2.30), resulting in $k$-th
   order accuracy. If all the $2k-1$ cells in the potential stencils are used,
   one could get $(2k-1)$-th order accuracy in smooth regions.

4. The ENO stencil choosing procedure involves many logical "if" structures, or
   equivalent mathematical formulae, which are not very efficient on certain
   vector computers such as CRAYs (however they are friendly to parallel
   computers).


There have been attempts in the literature to rectify the first problem, the
"free adaptation" of stencils. In [31] and [87], the following "biasing" strategy
was proposed. One first identifies a "preferred" stencil

$$ \tilde S_{pref}(i) = \{ x_{i-r+1/2}, \ldots, x_{i-r+k+1/2} \}, \tag{3.19} $$

which might be central or one-point upwind. One then replaces (3.17) by

$$ \left| V[x_{j-1/2}, \ldots, x_{j+l-1/2}] \right| <
   b \left| V[x_{j+1/2}, \ldots, x_{j+l+1/2}] \right| $$

if

$$ x_{j+1/2} > x_{i-r+1/2}, $$

i.e. if the left-most point $x_{j+1/2}$ in the current stencil $\tilde S_l(i)$
has not reached the left-most point $x_{i-r+1/2}$ of the preferred stencil
$\tilde S_{pref}(i)$ in (3.19) yet; otherwise, if

$$ x_{j+1/2} \le x_{i-r+1/2}, $$

one replaces (3.17) by

$$ b \left| V[x_{j-1/2}, \ldots, x_{j+l-1/2}] \right| <
   \left| V[x_{j+1/2}, \ldots, x_{j+l+1/2}] \right|. $$

Here, $b > 1$ is the so-called biasing parameter. Analysis in [87] indicates a
good choice of the parameter $b = 2$. The philosophy is to stay as close as
possible to the preferred stencil, unless the alternative candidate is, roughly
speaking, a factor $b > 1$ better in smoothness.

WENO is a more recent attempt to improve upon ENO in these four
points. The basic idea is the following: instead of using only one of the
candidate stencils to form the reconstruction, one uses a convex combination of
all of them. To be more precise, suppose the $k$ candidate stencils

$$ S_r(i) = \{ x_{i-r}, \ldots, x_{i-r+k-1} \}, \qquad r = 0, \ldots, k-1
   \tag{3.20} $$

produce $k$ different reconstructions to the value $v_{i+1/2}$, according to
(2.11),

$$ v^{(r)}_{i+1/2} = \sum_{j=0}^{k-1} c_{rj}\, \bar v_{i-r+j}, \qquad
   r = 0, \ldots, k-1. \tag{3.21} $$

WENO reconstruction would take a convex combination of all
$v^{(r)}_{i+1/2}$ defined in (3.21) as a new approximation to the cell boundary
value $v(x_{i+1/2})$:

$$ v_{i+1/2} = \sum_{r=0}^{k-1} \omega_r\, v^{(r)}_{i+1/2}. \tag{3.22} $$

Apparently, the key to the success of WENO would be the choice of the
weights $\omega_r$. We require

$$ \omega_r \ge 0, \qquad \sum_{r=0}^{k-1} \omega_r = 1 \tag{3.23} $$

for stability and consistency.

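If the function $v(x)$ is smooth in all of the candidate stencils (3.20), there
are constants $d_r$ such that

$$ v_{i+1/2} = \sum_{r=0}^{k-1} d_r\, v^{(r)}_{i+1/2}
   = v(x_{i+1/2}) + O(\Delta x^{2k-1}). \tag{3.24} $$

For example, $d_r$ for $1 \le k \le 3$ are given by

$$ \begin{aligned}
 & d_0 = 1, && k = 1; \\
 & d_0 = \tfrac{2}{3}, \quad d_1 = \tfrac{1}{3}, && k = 2; \\
 & d_0 = \tfrac{3}{10}, \quad d_1 = \tfrac{3}{5}, \quad
   d_2 = \tfrac{1}{10}, && k = 3.
\end{aligned} $$

We can see that $d_r$ is always positive and, due to consistency,

$$ \sum_{r=0}^{k-1} d_r = 1. \tag{3.25} $$

In this smooth case, we would like to have

$$ \omega_r = d_r + O(\Delta x^{k-1}), \qquad r = 0, \ldots, k-1, \tag{3.26} $$

which would imply $(2k-1)$-th order accuracy:

$$ v_{i+1/2} = \sum_{r=0}^{k-1} \omega_r\, v^{(r)}_{i+1/2}
   = v(x_{i+1/2}) + O(\Delta x^{2k-1}) \tag{3.27} $$

because

$$ \begin{aligned}
\sum_{r=0}^{k-1} \omega_r\, v^{(r)}_{i+1/2}
 - \sum_{r=0}^{k-1} d_r\, v^{(r)}_{i+1/2}
 &= \sum_{r=0}^{k-1} (\omega_r - d_r)
    \left( v^{(r)}_{i+1/2} - v(x_{i+1/2}) \right) \\
 &= \sum_{r=0}^{k-1} O(\Delta x^{k-1})\, O(\Delta x^k) \\
 &= O(\Delta x^{2k-1}),
\end{aligned} $$

where in the first equality we used (3.23) and (3.25).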

When the function $v(x)$ has a discontinuity in one or more of the stencils
(3.20), we would hope the corresponding weight(s) $\omega_r$ to be essentially
0, to emulate the successful ENO idea.
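
The constants $d_r$ in (3.24) for $k = 3$ can be recovered from Table 2.1 by matching coefficients: the combination of the three third order candidates must reproduce the fifth order row ($k = 5$, $r = 2$). The sketch below is an illustration with hypothetical helper names; it uses only table entries and the consistency condition that the $d_r$ sum to one.

```python
from fractions import Fraction as F

# rows r = 0, 1, 2 of Table 2.1 for k = 3
# (coefficients of vbar_{i-r}, vbar_{i-r+1}, vbar_{i-r+2})
c3 = {0: [F(1, 3), F(5, 6), F(-1, 6)],
      1: [F(-1, 6), F(5, 6), F(1, 3)],
      2: [F(1, 3), F(-7, 6), F(11, 6)]}

# row k = 5, r = 2 of Table 2.1 (coefficients of vbar_{i-2}, ..., vbar_{i+2})
c5 = [F(1, 30), F(-13, 60), F(47, 60), F(9, 20), F(-1, 20)]

def linear_weights():
    """Solve sum_r d_r v^{(r)} = fifth order reconstruction for d_0, d_1, d_2."""
    d = {}
    # cell i+2 appears only in stencil r = 0, cell i-2 only in stencil r = 2
    d[0] = c5[4] / c3[0][2]
    d[2] = c5[0] / c3[2][0]
    d[1] = 1 - d[0] - d[2]  # the weights must sum to one
    return d

def combined(d):
    """Coefficients of vbar_{i-2}, ..., vbar_{i+2} in sum_r d_r v^{(r)}_{i+1/2}."""
    out = [F(0)] * 5
    for r in range(3):
        for j in range(3):
            out[2 - r + j] += d[r] * c3[r][j]  # cell i-r+j sits at slot 2-r+j
    return out
```

The remaining three coefficients are not used to determine the weights, so checking that `combined(linear_weights())` reproduces the whole fifth order row is a genuine consistency test.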

Another consideration is that the weights should be smooth functions of
the cell averages involved. In fact, the weights designed in [55] and described
below are $C^\infty$.

Finally, we would like to have weights which are computationally efficient.
Thus, polynomials or rational functions are preferred over exponential type
functions.

All these considerations and ample numerical experiments lead to the
following form of weights:

$$ \omega_r = \frac{\alpha_r}{\sum_{s=0}^{k-1} \alpha_s}, \qquad
   r = 0, \ldots, k-1 \tag{3.28} $$

with

$$ \alpha_r = \frac{d_r}{(\epsilon + \beta_r)^2}. \tag{3.29} $$

Here $\epsilon > 0$ is introduced to prevent the denominator from becoming 0.
We take $\epsilon = 10^{-6}$ in all our numerical tests [55]. $\beta_r$ are the
so-called "smooth indicators" of the stencil $S_r(i)$: if the function $v(x)$ is
smooth in the stencil $S_r(i)$, then

$$ \beta_r = O(\Delta x^2), $$

but if $v(x)$ has a discontinuity inside the stencil $S_r(i)$, then

$$ \beta_r = O(1). $$

Translating into the weights $\omega_r$ in (3.28), we will have

$$ \omega_r = O(1) $$

when the function $v(x)$ is smooth in the stencil $S_r(i)$, and

$$ \omega_r = O(\Delta x^4) $$

if $v(x)$ has a discontinuity inside the stencil $S_r(i)$. Emulation of ENO near
a discontinuity is thus achieved.

One also has to worry about the accuracy requirement (3.26), which must
be checked when the specific form of the smooth indicator $\beta_r$ is given.
For any smooth indicator $\beta_r$, it is easy to see that the weights defined by
(3.28) satisfy (3.23). To satisfy (3.26), it suffices to have, through a Taylor
expansion analysis:

$$ \beta_r = D \left( 1 + O(\Delta x^{k-1}) \right), \qquad
   r = 0, \ldots, k-1, \tag{3.30} $$

where $D$ is a nonzero quantity independent of $r$ (but may depend on
$\Delta x$).

As we have seen in Sect. 3.1, the ENO reconstruction procedure chooses
the "smoothest" stencil by comparing a hierarchy of divided or undivided
differences. This is because these differences can be used to measure the
smoothness of the function on a stencil, (3.8)-(3.9). In [55], after extensive
experiments, a robust (for third and fifth order at least) choice of smooth
indicators $\beta_r$ is given. As we know, on each stencil $S_r(i)$, we can
construct a $(k-1)$-th degree reconstruction polynomial, which if evaluated at
$x = x_{i+1/2}$, renders the approximation to the value $v(x_{i+1/2})$ in
(3.21). Since the total variation is a good measurement for smoothness, it would
be desirable to minimize the total variation for this reconstruction polynomial
inside $I_i$. Consideration for a smooth flux and for the role of higher order
variations leads us to the following measurement for smoothness: let the
reconstruction polynomial on the stencil $S_r(i)$ be denoted by $p_r(x)$, we
define

$$ \beta_r = \sum_{l=1}^{k-1} \int_{x_{i-1/2}}^{x_{i+1/2}} \Delta x^{2l-1}
   \left( \frac{\partial^l p_r(x)}{\partial x^l} \right)^2 dx. \tag{3.31} $$


The right hand side of (3.31) is just a sum of the squares of scaled $L^2$ norms
for all the derivatives of the interpolation polynomial $p_r(x)$ over the
interval $(x_{i-1/2}, x_{i+1/2})$. The factor $\Delta x^{2l-1}$ is introduced to
remove any $\Delta x$ dependency in the derivatives, in order to preserve
self-similarity when applied to hyperbolic PDEs (Sect. 4).

We remark that (3.31) is similar to but smoother than the total variation
measurement based on the $L^1$ norm. It also renders a more accurate WENO
scheme for the case $k = 2$ and 3.

When $k = 2$, (3.31) gives the following smoothness measurement [69,55]:

$$ \beta_0 = (\bar v_{i+1} - \bar v_i)^2, \qquad
   \beta_1 = (\bar v_i - \bar v_{i-1})^2. \tag{3.32} $$

For $k = 3$, (3.31) gives [55]:

$$ \begin{aligned}
\beta_0 &= \frac{13}{12} (\bar v_i - 2 \bar v_{i+1} + \bar v_{i+2})^2
         + \frac{1}{4} (3 \bar v_i - 4 \bar v_{i+1} + \bar v_{i+2})^2, \\
\beta_1 &= \frac{13}{12} (\bar v_{i-1} - 2 \bar v_i + \bar v_{i+1})^2
         + \frac{1}{4} (\bar v_{i-1} - \bar v_{i+1})^2, \\
\beta_2 &= \frac{13}{12} (\bar v_{i-2} - 2 \bar v_{i-1} + \bar v_i)^2
         + \frac{1}{4} (\bar v_{i-2} - 4 \bar v_{i-1} + 3 \bar v_i)^2.
\end{aligned} \tag{3.33} $$

We can easily verify that the accuracy condition (3.30) is satisfied, even near
smooth extrema [55]. This indicates that (3.32) gives a third order WENO
scheme, and (3.33) gives a fifth order one.

Notice that the discussion here has a one point upwind bias in the optimal
linear stencil, suitable for a problem with wind blowing from left to
right. If the wind blows the other way, the procedure should be modified
symmetrically with respect to $x_{i+1/2}$.
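
As a worked check of (3.32), the case $k = 2$ can be done by hand. The short derivation below assumes the definition described above, a sum over derivative orders $l = 1, \ldots, k-1$ of the integrals of $\Delta x^{2l-1} (\partial^l p_r / \partial x^l)^2$ over $I_i$, and uses the fact that the cell average of a linear function equals its value at the cell center.

```latex
% k = 2: p_r is linear, so only the l = 1 term of the smoothness measure remains.
% On S_0(i) = {I_i, I_{i+1}} the linear polynomial with cell averages
% \bar v_i and \bar v_{i+1} has constant slope
%   p_0'(x) = (\bar v_{i+1} - \bar v_i) / \Delta x .
\begin{aligned}
\beta_0 &= \int_{x_{i-1/2}}^{x_{i+1/2}} \Delta x \, \bigl( p_0'(x) \bigr)^2 \, dx
         = \Delta x \cdot \Delta x \cdot
           \frac{(\bar v_{i+1} - \bar v_i)^2}{\Delta x^2}
         = (\bar v_{i+1} - \bar v_i)^2 , \\
\beta_1 &= (\bar v_i - \bar v_{i-1})^2
         \quad \text{by the same argument on } S_1(i) = \{ I_{i-1}, I_i \} .
\end{aligned}
```

The scaling factor $\Delta x^{2l-1}$ cancels the powers of $\Delta x$ coming from the derivative and from the length of the cell, so the indicators depend only on differences of cell averages.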

In  summary,  we  have  the  following  WENO  reconstruction  procedure: 
Algorithm 3.2. 1D WENO reconstruction.

Given the cell averages $\{\bar v_i\}$ of a function $v(x)$, for each cell
$I_i$ we obtain upwind biased $(2k-1)$-th order approximations to the function
$v(x)$ at the cell boundaries, denoted by $v^-_{i+1/2}$ and $v^+_{i-1/2}$, in
the following way:

1. Obtain the $k$ reconstructed values $v^{(r)}_{i+1/2}$, of $k$-th order
accuracy, in (3.21), based on the stencils (3.20), for $r = 0, \ldots, k-1$;
Also obtain the $k$ reconstructed values $v^{(r)}_{i-1/2}$, of $k$-th order
accuracy, using (2.10), again based on the stencils (3.20), for
$r = 0, \ldots, k-1$;

2. Find the constants $d_r$ and $\tilde d_r$, such that (3.24) and

$$v_{i-1/2} = \sum_{r=0}^{k-1} \tilde d_r\, v^{(r)}_{i-1/2} = v(x_{i-1/2}) + O(\Delta x^{2k-1})$$

are valid. By symmetry,

$$\tilde d_r = d_{k-1-r}.$$

3. Find the smooth indicators $\beta_r$ in (3.31), for all
$r = 0, \ldots, k-1$. Explicit formulae for $k = 2$ and $k = 3$ are given in
(3.32) and (3.33) respectively.

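4. Form the weights $\omega_r$ and $\tilde\omega_r$ using (3.28)-(3.29) and

$$\tilde\omega_r = \frac{\tilde\alpha_r}{\sum_{s=0}^{k-1} \tilde\alpha_s}, \qquad \tilde\alpha_r = \frac{\tilde d_r}{(\epsilon + \beta_r)^2}, \qquad r = 0, \ldots, k-1.$$

5. Find the $(2k-1)$-th order reconstruction

$$v^-_{i+1/2} = \sum_{r=0}^{k-1} \omega_r\, v^{(r)}_{i+1/2}, \qquad v^+_{i-1/2} = \sum_{r=0}^{k-1} \tilde\omega_r\, v^{(r)}_{i-1/2}. \tag{3.34}$$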

□ 
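
The five steps of Algorithm 3.2 can be sketched for $k = 3$ as follows. This
is an illustration under stated assumptions rather than a reference
implementation: the candidate coefficients and the linear weights
$d = (3/10, 3/5, 1/10)$ are the standard uniform-grid values that (3.21) and
(3.24) produce for $k = 3$ (those equations are not reproduced in this
section), `eps = 1e-6` is an assumed choice of the small parameter
$\epsilon$, and the function name `weno5_left_value` is hypothetical.

```python
import numpy as np

def weno5_left_value(v, i, eps=1e-6):
    """Approximate v(x_{i+1/2}) from cell averages v[i-2 .. i+2] (k = 3)."""
    vm2, vm1, v0, vp1, vp2 = v[i - 2], v[i - 1], v[i], v[i + 1], v[i + 2]

    # Step 1: third order candidates on the stencils r = 0, 1, 2.
    cand = np.array([
        v0 / 3 + 5 * vp1 / 6 - vp2 / 6,        # r = 0: cells i, i+1, i+2
        -vm1 / 6 + 5 * v0 / 6 + vp1 / 3,       # r = 1: cells i-1, i, i+1
        vm2 / 3 - 7 * vm1 / 6 + 11 * v0 / 6,   # r = 2: cells i-2, i-1, i
    ])

    # Step 2: linear weights d_r (assumed uniform-grid values for k = 3).
    d = np.array([0.3, 0.6, 0.1])

    # Step 3: smooth indicators (3.33).
    beta = np.array([
        13 / 12 * (v0 - 2 * vp1 + vp2) ** 2 + 0.25 * (3 * v0 - 4 * vp1 + vp2) ** 2,
        13 / 12 * (vm1 - 2 * v0 + vp1) ** 2 + 0.25 * (vm1 - vp1) ** 2,
        13 / 12 * (vm2 - 2 * vm1 + v0) ** 2 + 0.25 * (vm2 - 4 * vm1 + 3 * v0) ** 2,
    ])

    # Step 4: nonlinear weights, normalized to sum to one.
    alpha = d / (eps + beta) ** 2
    omega = alpha / alpha.sum()

    # Step 5: convex combination of the candidates, as in (3.34).
    return float(omega @ cand)
```

The value $v^+_{i-1/2}$ follows from the same routine applied to the mirrored
data, which is the symmetry $\tilde d_r = d_{k-1-r}$ stated in step 2.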


We  can  obtain  weights  for  higher  orders  of  k  (corresponding  to  seventh 
and  higher  order  WENO  schemes)  using  the  same  recipe.  However,  these 
schemes  of  seventh  and  higher  order  have  not  been  extensively  tested  yet. 
Current  research  of  Balsara  and  Shu  [5]  addresses  this  issue. 


4  ENO  and  WENO  Schemes  in  One  Dimension 

In  this  section  we  describe  the  ENO  and  WENO  schemes  for  one  dimensional 
conservation  laws: 

$$u_t(x,t) + f_x(u(x,t)) = 0 \tag{4.1}$$

equipped  with  suitable  initial  and  boundary  conditions. 

We  will  concentrate  on  the  discussion  of  spatial  discretization,  and  will 
leave  the  time  variable  t  continuous  (the  method-of-lines  approach).  Time 
discretization  will  be  discussed  in  Sect.  9. 
Our computational domain is $a \le x \le b$. We have a grid defined by (2.1),
with the notations (2.2)-(2.3). Except in Sect. 4.5, we do not consider
boundary conditions. We thus assume that the values of the numerical solution
are also available outside the computational domain whenever they are
needed. This would be the case for periodic or compactly supported problems.


4.1  Finite  Volume  Formulation  in  the  Scalar  Case 


For finite volume schemes, or schemes based on cell averages, we do not solve
(4.1) directly, but its integrated version. We integrate (4.1) over the
interval $I_i$ to obtain

$$\frac{d\bar u(x_i,t)}{dt} = -\frac{1}{\Delta x_i}\left( f(u(x_{i+1/2},t)) - f(u(x_{i-1/2},t)) \right), \tag{4.2}$$

where

$$\bar u(x_i,t) = \frac{1}{\Delta x_i}\int_{x_{i-1/2}}^{x_{i+1/2}} u(\xi,t)\, d\xi \tag{4.3}$$

is the cell average. We approximate (4.2) by the following conservative scheme

$$\frac{d\bar u_i(t)}{dt} = -\frac{1}{\Delta x_i}\left( \hat f_{i+1/2} - \hat f_{i-1/2} \right), \tag{4.4}$$

where $\bar u_i(t)$ is the numerical approximation to the cell average
$\bar u(x_i,t)$, and the numerical flux $\hat f_{i+1/2}$ is defined by

$$\hat f_{i+1/2} = h(u^-_{i+1/2}, u^+_{i+1/2}) \tag{4.5}$$

with the values $u^\pm_{i+1/2}$ obtained by the ENO reconstruction Algorithm
3.1, or by the WENO reconstruction Algorithm 3.2.

The two argument function $h$ in (4.5) is a monotone flux. It satisfies:

- $h(a,b)$ is a Lipschitz continuous function in both arguments;

- $h(a,b)$ is a nondecreasing function in $a$ and a nonincreasing function in
$b$. Symbolically $h(\uparrow,\downarrow)$;

- $h(a,b)$ is consistent with the physical flux $f$, that is, $h(a,a) = f(a)$.

Examples of monotone fluxes include:

1. Godunov flux:

$$h(a,b) = \begin{cases} \min_{a \le u \le b} f(u) & \text{if } a \le b, \\ \max_{b \le u \le a} f(u) & \text{if } a > b; \end{cases} \tag{4.6}$$

2. Engquist-Osher flux:

$$h(a,b) = \int_0^a \max(f'(u),0)\,du + \int_0^b \min(f'(u),0)\,du + f(0); \tag{4.7}$$

3. Lax-Friedrichs flux:

$$h(a,b) = \frac{1}{2}\left[ f(a) + f(b) - \alpha(b-a) \right] \tag{4.8}$$

where $\alpha = \max_u |f'(u)|$ is a constant. The maximum is taken over the
relevant range of $u$.

We have listed the monotone fluxes from the least dissipative (less smearing
of discontinuities) to the most. For lower order methods (order of
reconstruction is 1 or 2), there is a big difference between results obtained
by different monotone fluxes. However, this difference becomes much smaller
for higher order reconstructions. In Fig. 4.1, we plot the results of a right
moving shock for the Burgers' equation ($f(u) = \frac{u^2}{2}$ in (4.1)), with
first order reconstruction using Godunov and Lax-Friedrichs monotone fluxes
(top), and with fourth order ENO reconstruction using Godunov and
Lax-Friedrichs monotone fluxes (bottom). We can clearly see that, while the
Godunov flux behaves much better for the first order scheme, the two fourth
order ENO schemes behave similarly. We thus use the simple and inexpensive
Lax-Friedrichs flux in most of our high order calculations.
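
For a concrete comparison, the three monotone fluxes (4.6)-(4.8) have closed
forms for the Burgers' flux $f(u) = u^2/2$. The sketch below assumes that
specific flux (so the minimum of $f$ sits at $u = 0$ and $f(0) = 0$); the
function names are hypothetical and `alpha` must be supplied by the caller as
a bound on $|f'(u)| = |u|$.

```python
def f(u):
    """Burgers' flux."""
    return 0.5 * u * u

def godunov(a, b):
    """Godunov flux (4.6) for the convex flux u^2/2."""
    if a <= b:
        # Minimum of f over [a, b]; it is attained at 0 if 0 lies inside.
        if a <= 0.0 <= b:
            return 0.0
        return min(f(a), f(b))
    # a > b: maximum of a convex f over [b, a] is attained at an endpoint.
    return max(f(a), f(b))

def engquist_osher(a, b):
    """Engquist-Osher flux (4.7); with f'(u) = u the integrals are explicit."""
    return f(max(a, 0.0)) + f(min(b, 0.0))

def lax_friedrichs(a, b, alpha):
    """Lax-Friedrichs flux (4.8)."""
    return 0.5 * (f(a) + f(b) - alpha * (b - a))
```

For the states $a = -1$, $b = 1$ (a rarefaction), the Godunov and
Engquist-Osher fluxes both return $0$, while the Lax-Friedrichs flux with
$\alpha = 1$ returns $-1/2$; the extra term $-\alpha(b-a)/2$ is the additional
dissipation mentioned above.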

We remark that, by the classical Lax-Wendroff theorem [65], the solution
to the conservative scheme (4.4), if it converges, will converge to a weak
solution of (4.1).

In summary, to build a finite volume ENO scheme (4.4), given the cell
averages $\{\bar u_i\}$ (we will often drop the explicit reference to the time
variable $t$), we proceed as follows:

Algorithm 4.1. Finite volume 1D scalar ENO and WENO schemes.

1. Follow Algorithm 3.1 in Sect. 3.1 for ENO, or Algorithm 3.2 in Sect. 3.2
for WENO, to obtain the $k$-th order reconstructed values $u^-_{i+1/2}$ and
$u^+_{i+1/2}$, for all $i$;

2. Choose a monotone flux (e.g., one of (4.6) to (4.8)), and use (4.5) to
compute the flux $\hat f_{i+1/2}$ for all $i$;

3. Form the scheme (4.4).

□ 

Notice that the finite volume scheme can be applied to arbitrary nonuniform
grids.


4.2  Finite  Difference  Formulation  in  the  Scalar  Case 


We first assume the grid is uniform and solve (4.1) directly using a
conservative approximation to the spatial derivative:

$$\frac{du_i(t)}{dt} = -\frac{1}{\Delta x}\left( \hat f_{i+1/2} - \hat f_{i-1/2} \right) \tag{4.9}$$
Fig. 4.1. First order (top) and fourth order (bottom) ENO schemes for the
Burgers equation, with the Godunov flux (left) and the Lax-Friedrichs flux
(right). Solid lines: exact solution; circles: the computed solution at t = 4.
where $u_i(t)$ is the numerical approximation to the point value $u(x_i,t)$,
and the numerical flux

$$\hat f_{i+1/2} = \hat f(u_{i-r}, \ldots, u_{i+s})$$

satisfies the following conditions:

- $\hat f$ is a Lipschitz continuous function in all the arguments;

- $\hat f$ is consistent with the physical flux $f$, that is,
$\hat f(u, \ldots, u) = f(u)$.


Again the Lax-Wendroff theorem [65] applies. The solution to the conservative
scheme (4.9), if it converges, will converge to a weak solution of (4.1).

The numerical flux $\hat f_{i+1/2}$ is obtained by the ENO or WENO
reconstruction procedures, Algorithm 3.1 or 3.2, with $\bar v(x) = f(u(x,t))$.
For stability, it is important that upwinding is used in constructing the
flux. The easiest and the most inexpensive way to achieve upwinding is the
following: compute the Roe speed

$$\bar a_{i+1/2} = \frac{f(u_{i+1}) - f(u_i)}{u_{i+1} - u_i} \tag{4.10}$$

(when $u_{i+1} = u_i$ one should use $\bar a_{i+1/2} = f'(u_i)$) and

- if $\bar a_{i+1/2} \ge 0$, then the wind blows from the left to the right.
We would use $v^-_{i+1/2}$ for the numerical flux $\hat f_{i+1/2}$;

- if $\bar a_{i+1/2} < 0$, then the wind blows from the right to the left. We
would use $v^+_{i+1/2}$ for the numerical flux $\hat f_{i+1/2}$.



This  produces  the  Roe  scheme  [82]  at  the  first  order  level.  For  this  reason, 
the  ENO  scheme  based  on  this  approach  was  termed  “ENO-Roe”  in  [90]. 

In summary, to build a finite difference ENO scheme (4.9) using the ENO-Roe
approach, given the point values $\{u_i\}$ (we again drop the explicit
reference to the time variable $t$), we proceed as follows:


Algorithm 4.2. Finite difference 1D scalar ENO-Roe and WENO-Roe schemes.

1. Compute the Roe speed $\bar a_{i+1/2}$ for all $i$ using (4.10);

2. Identify $\bar v_i = f(u_i)$ and use the ENO reconstruction Algorithm 3.1
or the WENO reconstruction Algorithm 3.2, to obtain the cell boundary values
$v^-_{i+1/2}$ if $\bar a_{i+1/2} \ge 0$, or $v^+_{i+1/2}$ if
$\bar a_{i+1/2} < 0$;

3. If the Roe speed at $x_{i+1/2}$ is nonnegative,

$$\bar a_{i+1/2} \ge 0,$$

then take the numerical flux as:

$$\hat f_{i+1/2} = v^-_{i+1/2};$$

otherwise, take the numerical flux as:

$$\hat f_{i+1/2} = v^+_{i+1/2};$$

4. Form the scheme (4.9).


□ 


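One disadvantage of the ENO-Roe approach is that entropy violating
solutions may be obtained, just like in the first order Roe scheme case. For
example, if ENO-Roe is applied to the Burgers equation

$$u_t + \left( \frac{u^2}{2} \right)_x = 0$$

with the following initial condition

$$u(x,0) = \begin{cases} -1, & \text{if } x < 0, \\ \phantom{-}1, & \text{if } x \ge 0, \end{cases}$$

it will converge to the entropy violating expansion shock:

$$u(x,t) = \begin{cases} -1, & \text{if } x < 0, \\ \phantom{-}1, & \text{if } x \ge 0. \end{cases}$$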
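
The mechanism is easy to see at the first order level. The sketch below is a
simplified illustration, not the high order scheme: it uses first order
reconstruction (the cell boundary value is the neighbouring point value), the
Roe speed (4.10), and, for comparison, the first order flux obtained from a
Lax-Friedrichs splitting with an assumed constant `alpha = 1`. The function
names are hypothetical.

```python
import numpy as np

def f(u):
    return 0.5 * u ** 2          # Burgers' flux

def df(u):
    return u                     # f'(u)

def roe_flux_first_order(u):
    """First order Roe flux at the interior interfaces i+1/2."""
    ul, ur = u[:-1], u[1:]
    du = ur - ul
    safe = np.where(du != 0.0, du, 1.0)            # avoid division by zero
    a = np.where(du != 0.0, (f(ur) - f(ul)) / safe, df(ul))   # (4.10)
    return np.where(a >= 0.0, f(ul), f(ur))        # upwind by sign of a

def lf_split_flux_first_order(u, alpha):
    """First order flux f^+(u_i) + f^-(u_{i+1}) with f^± = (f ± alpha*u)/2."""
    ul, ur = u[:-1], u[1:]
    return 0.5 * (f(ul) + alpha * ul) + 0.5 * (f(ur) - alpha * ur)

# Initial data of the expansion shock example: -1 on the left, +1 on the right.
u0 = np.where(np.arange(8) < 4, -1.0, 1.0)
roe = roe_flux_first_order(u0)
lf = lf_split_flux_first_order(u0, alpha=1.0)
```

With these data every Roe flux equals $1/2$, so all flux differences vanish
and the discontinuity never moves. The split flux equals $-1/2$ at the jump
and $1/2$ elsewhere, so the scheme starts to open the correct rarefaction.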


Local entropy correction could be used to rectify this [90]. However, it is
usually more robust to use a global "flux splitting":

$$f(u) = f^+(u) + f^-(u) \tag{4.11}$$

where

$$\frac{df^+(u)}{du} \ge 0, \qquad \frac{df^-(u)}{du} \le 0. \tag{4.12}$$


We would need the positive and negative fluxes $f^\pm(u)$ to have as many
derivatives as the order of the scheme. This unfortunately rules out many
popular flux splittings (such as those of van Leer [101] and Osher [77]) for
high order methods in this framework.

The simplest smooth splitting is the Lax-Friedrichs splitting:

$$f^\pm(u) = \frac{1}{2}\left( f(u) \pm \alpha u \right) \tag{4.13}$$

where $\alpha$ is again taken as $\alpha = \max_u |f'(u)|$ over the relevant
range of $u$.

We note that there is a close relationship between a flux splitting (4.11)
and a monotone flux (4.5). In fact, for any flux splitting (4.11) satisfying
(4.12),

$$h(a,b) = f^+(a) + f^-(b) \tag{4.14}$$

is clearly a monotone flux. However, not every monotone flux can be written
in the flux split form (4.11). For example, the Godunov flux (4.6) cannot.
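
As a short worked check of this relationship, substituting the
Lax-Friedrichs splitting (4.13) into (4.14) recovers the Lax-Friedrichs
monotone flux (4.8), and the three defining properties of a monotone flux
follow from (4.11)-(4.12):

```latex
h(a,b) = f^+(a) + f^-(b)
       = \tfrac{1}{2}\bigl(f(a) + \alpha a\bigr) + \tfrac{1}{2}\bigl(f(b) - \alpha b\bigr)
       = \tfrac{1}{2}\bigl[f(a) + f(b) - \alpha (b - a)\bigr],
\qquad
\frac{\partial h}{\partial a} = \frac{df^+}{du}(a) \ge 0, \quad
\frac{\partial h}{\partial b} = \frac{df^-}{du}(b) \le 0, \quad
h(a,a) = f^+(a) + f^-(a) = f(a).
```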

With the flux splitting (4.11), we apply the ENO or WENO reconstruction
procedures, Algorithm 3.1 or 3.2, with $\bar v(x) = f^+(u(x,t))$ and
$\bar v(x) = f^-(u(x,t))$ separately, to obtain two numerical fluxes
$\hat f^+_{i+1/2}$ and $\hat f^-_{i+1/2}$, and then sum them to get the
numerical flux $\hat f_{i+1/2}$.

In summary, to build a finite difference ENO or WENO scheme (4.9) using
the flux splitting approach, given the point values $\{u_i\}$, we proceed as
follows:

Algorithm 4.3. Finite difference 1D scalar flux splitting ENO and WENO
schemes.

1. Find a smooth flux splitting (4.11), satisfying (4.12);

2. Identify $\bar v_i = f^+(u_i)$ and use the ENO or WENO reconstruction
procedures, Algorithm 3.1 or 3.2, to obtain the cell boundary values
$v^-_{i+1/2}$ for all $i$;

3. Take the positive numerical flux as

$$\hat f^+_{i+1/2} = v^-_{i+1/2};$$

4. Identify $\bar v_i = f^-(u_i)$ and use the ENO or WENO reconstruction
procedures, Algorithm 3.1 or 3.2, to obtain the cell boundary values
$v^+_{i+1/2}$ for all $i$;

5. Take the negative numerical flux as

$$\hat f^-_{i+1/2} = v^+_{i+1/2};$$

6. Form the numerical flux as

$$\hat f_{i+1/2} = \hat f^+_{i+1/2} + \hat f^-_{i+1/2};$$

7. Form the scheme (4.9).

□

We remark that the finite difference scheme in this section and the finite
volume scheme in Sect. 4.1 are equivalent for one dimensional, linear PDEs
with constant coefficients: the only difference is in the initial condition
(the finite difference version uses point values and the finite volume version
uses cell averages of the exact initial condition). Notice that the schemes
are still nonlinear in this case. However, this equivalency does not hold for
a nonlinear PDE. Moreover, we will see later that there are significant
differences in efficiency of the two approaches for multidimensional problems.

In the following we test the accuracy of the fifth order finite difference
WENO schemes on the linear equation:

$$u_t + u_x = 0, \qquad -1 \le x \le 1,$$

$$u(x,0) = u_0(x) \quad \text{periodic}.$$

In Table 4.1, we show the errors of the fifth order WENO scheme given by
the weights (3.28)-(3.29) with the smooth indicator (3.33), at time $t = 1$
for the initial condition $u_0(x) = \sin(\pi x)$, and compare them with the
errors of the linear 5-th order upstream central scheme (i.e. the scheme with
the linear weights $d_r$ as in (3.24)). We can see that fifth order WENO gives
the expected order of accuracy starting at about 40 grid points.
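
The flux splitting procedure of Algorithm 4.3 and an accuracy test of this
kind can be sketched as follows. Several ingredients are assumptions that do
not come from this section: the $k = 3$ candidate coefficients and linear
weights (standard uniform-grid values), `eps = 1e-6`, a third order
Runge-Kutta time discretization in its usual convex-combination form, and a
time step proportional to $\Delta x^{5/3}$ so that the temporal error does not
mask the fifth order spatial error. All function names are hypothetical, and
the errors this sketch produces need not match Table 4.1 digit for digit.

```python
import numpy as np

def weno5_minus(v, eps=1e-6):
    """v^-_{i+1/2} for all i from periodic data v (Algorithm 3.2, k = 3)."""
    vm2, vm1, v0, vp1, vp2 = (np.roll(v, s) for s in (2, 1, 0, -1, -2))
    c0 = v0 / 3 + 5 * vp1 / 6 - vp2 / 6
    c1 = -vm1 / 6 + 5 * v0 / 6 + vp1 / 3
    c2 = vm2 / 3 - 7 * vm1 / 6 + 11 * v0 / 6
    b0 = 13 / 12 * (v0 - 2 * vp1 + vp2) ** 2 + 0.25 * (3 * v0 - 4 * vp1 + vp2) ** 2
    b1 = 13 / 12 * (vm1 - 2 * v0 + vp1) ** 2 + 0.25 * (vm1 - vp1) ** 2
    b2 = 13 / 12 * (vm2 - 2 * vm1 + v0) ** 2 + 0.25 * (vm2 - 4 * vm1 + 3 * v0) ** 2
    a0 = 0.3 / (eps + b0) ** 2
    a1 = 0.6 / (eps + b1) ** 2
    a2 = 0.1 / (eps + b2) ** 2
    return (a0 * c0 + a1 * c1 + a2 * c2) / (a0 + a1 + a2)

def weno5_plus(v):
    """v^+_{i+1/2} for all i, obtained from weno5_minus by mirror symmetry."""
    mirrored = weno5_minus(v[::-1])[::-1]   # entry m holds v^+_{m-1/2}
    return np.roll(mirrored, -1)            # shift so entry i holds v^+_{i+1/2}

def rhs(u, dx, f, alpha):
    """Right-hand side of (4.9) with the Lax-Friedrichs splitting (4.13)."""
    f_plus = 0.5 * (f(u) + alpha * u)
    f_minus = 0.5 * (f(u) - alpha * u)
    flux = weno5_minus(f_plus) + weno5_plus(f_minus)   # steps 2 to 6
    return -(flux - np.roll(flux, 1)) / dx             # step 7

def linf_error(n, t_end=1.0):
    """L-infinity error for u_t + u_x = 0, u_0 = sin(pi x), periodic on [-1, 1]."""
    dx = 2.0 / n
    x = -1.0 + dx * np.arange(n)
    u = np.sin(np.pi * x)
    f = lambda w: w                                    # linear flux, alpha = 1
    steps = int(np.ceil(t_end / (0.5 * dx ** (5.0 / 3.0))))
    dt = t_end / steps
    for _ in range(steps):
        u1 = u + dt * rhs(u, dx, f, 1.0)
        u2 = 0.75 * u + 0.25 * (u1 + dt * rhs(u1, dx, f, 1.0))
        u = u / 3.0 + 2.0 / 3.0 * (u2 + dt * rhs(u2, dx, f, 1.0))
    return float(np.max(np.abs(u - np.sin(np.pi * (x - t_end)))))
```

Calling `linf_error(n)` for a sequence of doubled `n` and taking
`log2(e_n / e_2n)` gives an observed order that can be compared with the
"order" columns of Table 4.1.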


Table 4.1. Accuracy on $u_t + u_x = 0$ with $u_0(x) = \sin(\pi x)$.

Fifth order WENO scheme

    N      L∞ error    L∞ order    L1 error    L1 order
    10     2.98e-2     -           1.60e-2     -
    20     1.45e-3     4.36        7.41e-4     4.43
    40     4.58e-5     4.99        2.22e-5     5.06
    80     1.48e-6     4.95        6.91e-7     5.01
    160    4.41e-8     5.07        2.17e-8     4.99
    320    1.35e-9     5.03        6.79e-10    5.00

Fifth order linear upwind-central scheme

    N      L∞ error    L∞ order    L1 error    L1 order
    10     4.98e-3     -           3.07e-3     -
    20     1.60e-4     4.96        9.92e-5     4.95
    40     5.03e-6     4.99        3.14e-6     4.98
    80     1.57e-7     5.00        9.90e-8     4.99
    160    4.91e-9     5.00        3.11e-9     4.99
    320    1.53e-10    5.00        9.73e-11    5.00

In Table 4.2, we show errors for the initial condition
$u_0(x) = \sin^4(\pi x)$. The order of accuracy for the fifth order WENO
settles down later than in the previous example. Notice that this is the
example for which ENO schemes lose their accuracy [83], [87].

We emphasize again that the high order conservative finite difference ENO
and WENO schemes of third or higher order accuracy can only be applied
to a uniform grid or a smoothly varying grid, i.e. a grid such that a smooth
transformation

$$\xi = \xi(x)$$

will result in a uniform grid in the new variable $\xi$. Here $\xi$ must have
as many derivatives as the order of accuracy of the scheme calls for. If this
is the case, then (4.1) is transformed to

$$u_t + \xi_x f(u)_\xi = 0$$
Table 4.2. Accuracy on $u_t + u_x = 0$ with $u_0(x) = \sin^4(\pi x)$.

Fifth order WENO scheme

    N      L∞ error    L∞ order    L1 error    L1 order
    20     1.08e-1     -           4.91e-2     -
    40     8.90e-3     3.60        3.64e-3     3.75
    80     1.80e-3     2.31        5.00e-4     2.86
    160    1.22e-4     3.88        2.17e-5     4.53
    320    4.37e-6     4.80        6.17e-7     5.14
    640    9.79e-8     5.48        1.57e-8     5.30

Fifth order linear upwind-central scheme

    N      L∞ error    L∞ order    L1 error    L1 order
    20     5.23e-2     -           3.35e-2     -
    40     2.47e-3     4.40        1.52e-3     4.46
    80     8.32e-5     4.89        5.09e-5     4.90
    160    2.65e-6     4.97        1.60e-6     4.99
    320    8.31e-8     5.00        4.99e-8     5.00
    640    2.60e-9     5.00        1.56e-9     5.00

and the conservative ENO or WENO derivative approximation is then applied
to $f(u)_\xi$. It is proven in [77] that this way the scheme is still
conservative, i.e. the Lax-Wendroff theorem [65] still applies.


4.3  Provable  Properties  in  the  Scalar  Case 

Second  order  ENO  schemes  are  also  TVD  (total  variation  diminishing),  hence 
have  at  least  subsequences  which  converge  to  weak  solutions.  There  is  no 
known  convergence  result  for  ENO  schemes  of  degree  higher  than  2,  even  for 
smooth  solutions. 

WENO  schemes  have  better  convergence  results,  mainly  because  their 
numerical  fluxes  are  smoother.  It  is  proven  [55]  that  WENO  schemes  converge 
for  smooth  solutions.  Also,  Jiang  and  Yu  [56]  have  obtained  an  existence 
proof  for  traveling  waves  for  WENO  schemes.  This  is  an  important  first  step 
towards  the  proof  of  convergence  for  shocked  cases. 

Even  though  there  are  very  few  theoretical  results  about  ENO  or  WENO 
schemes,  in  practice  these  schemes  are  very  robust  and  stable.  We  caution 
against  any  attempts  to  modify  the  schemes  solely  for  the  purpose  of  stability 
or  convergence  proofs.  In  [89]  we  gave  a  remark  about  a  modification  of  ENO 
schemes,  which  keeps  the  formal  uniform  high  order  accuracy  and  makes 
them  stable  and  convergent  for  general  multi  dimensional  scalar  equations. 
However, it was pointed out there that the modification is not computationally
useful, hence the convergence result has little value.

The remark in [89] is illustrative, hence we reproduce it here. We start
with a flux splitting (4.11) satisfying (4.12), and notice that the first
order monotone scheme

$$\frac{du_i}{dt} = -\frac{1}{\Delta x}\left( f^+(u_i) - f^+(u_{i-1}) + f^-(u_{i+1}) - f^-(u_i) \right) = R_1(u)_i \tag{4.15}$$

is convergent (also for multi space dimensions). We now construct a high order
ENO approximation in the following way: starting from the two point stencil
$\{x_{i-1}, x_i\}$, we expand it into a $k+1$ point stencil in an ENO fashion
using the divided differences of $f^+(u(x))$. We then build the $k$-th degree
polynomial $P^+(x)$ which interpolates $f^+(u(x))$ in this stencil. $P^-(x)$
is constructed in a similar way, starting from the two point stencil
$\{x_i, x_{i+1}\}$. The scheme is finally defined as

$$\frac{du_i}{dt} = -\frac{d}{dx}\left( P^+(x) + P^-(x) \right)\Big|_{x = x_i} = R_k(u)_i. \tag{4.16}$$

This scheme is clearly $k$-th order accurate but is not conservative. We now
denote the difference between the high order scheme (4.16) and the first order
monotone scheme (4.15) by

$$D(u)_i = R_k(u)_i - R_1(u)_i, \tag{4.17}$$

and limit it by

$$\tilde D(u)_i = m(D(u)_i, M \Delta x^\alpha), \tag{4.18}$$

where $M > 0$ and $0 < \alpha \le 1$ are constants, and the capping function
$m$ is defined by

$$m(a,b) = \begin{cases} a, & \text{if } |a| \le b; \\ b, & \text{if } a > b; \\ -b, & \text{if } a < -b. \end{cases}$$

The modified ENO scheme is then defined by

$$\frac{du_i}{dt} = \tilde R_k(u)_i = R_1(u)_i + \tilde D(u)_i. \tag{4.19}$$

We notice that, in smooth regions, the difference between the first order and
high order residues, $D(u)_i$, as defined in (4.17), is of the size
$O(\Delta x)$, hence the capping (4.18) does not take effect in such regions,
if $\alpha < 1$ or if $\alpha = 1$ and $M$ is large enough, when $\Delta x$ is
sufficiently small. This implies that the scheme (4.19) is uniformly accurate.
Moreover, since

$$\left| \tilde R_k(u)_i - R_1(u)_i \right| \le M \Delta x^\alpha$$

by (4.18), the high order scheme (4.19) shares every good property of the
first order monotone scheme (4.15), such as total variation boundedness,
entropy conditions, and convergence. From a theoretical point of view, this is
the strongest result one could possibly hope for a high order scheme. However,
the mesh size dependent limiting (4.18) renders the scheme highly impractical:
the quality of the numerical solution will depend strongly on the choice of
the parameters $M$ and $\alpha$, as well as on the mesh size $\Delta x$.
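
For illustration only, the capping (4.18) and the modified residue (4.19) are
a few lines of code. The sketch assumes the two residues $R_1(u)_i$ and
$R_k(u)_i$ have already been computed and are passed in as arrays; the names
`cap` and `limited_residual` are hypothetical.

```python
import numpy as np

def cap(a, b):
    """Capping function m(a, b): a if |a| <= b, b if a > b, -b if a < -b."""
    return np.clip(a, -b, b)

def limited_residual(r1, rk, M, alpha, dx):
    """Modified residue (4.19): R_1 plus the capped difference (4.17)-(4.18)."""
    return r1 + cap(rk - r1, M * dx ** alpha)
```

The explicit dependence of the result on `M`, `alpha` and `dx` is exactly the
mesh size dependence that makes the modification impractical.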


4.4  Systems 

We only consider hyperbolic $m \times m$ systems, i.e. the Jacobian $f'(u)$
has $m$ real eigenvalues

$$\lambda_1(u) \le \ldots \le \lambda_m(u) \tag{4.20}$$

and a complete set of independent eigenvectors

$$r_1(u), \ldots, r_m(u). \tag{4.21}$$

We denote the matrix whose columns are eigenvectors (4.21) by

$$R(u) = (r_1(u), \ldots, r_m(u)). \tag{4.22}$$

Then clearly

$$R^{-1}(u)\, f'(u)\, R(u) = \Lambda(u) \tag{4.23}$$

where $\Lambda(u)$ is the diagonal matrix with
$\lambda_1(u), \ldots, \lambda_m(u)$ on the diagonal. Notice that the rows of
$R^{-1}(u)$, denoted by $l_1(u), \ldots, l_m(u)$ (row vectors), are left
eigenvectors of $f'(u)$:

$$l_i(u) f'(u) = \lambda_i(u)\, l_i(u), \qquad i = 1, \ldots, m. \tag{4.24}$$

There  are  several  ways  to  generalize  scalar  ENO  or  WENO  schemes  to 
systems. 

The easiest way is to apply the ENO or WENO schemes in a component
by component fashion. For the finite volume formulation, this means that we
make the reconstruction using ENO or WENO for each of the components of
$u$ separately. This produces the left and right values $u^\pm_{i+1/2}$ at the
cell interface $x_{i+1/2}$. An exact or approximate Riemann solver,
$h(u^-_{i+1/2}, u^+_{i+1/2})$, is then used to build the scheme (4.4)-(4.5).
The exact Riemann solver is given by the exact solution of (4.1) with the
following step function as initial condition

$$u(x,0) = \begin{cases} u^-_{i+1/2}, & x < 0, \\ u^+_{i+1/2}, & x > 0, \end{cases} \tag{4.25}$$

evaluated at the center $x = 0$. Notice that the solution to (4.1) with the
initial condition (4.25) is self-similar, that is, it is a function of the
variable $\xi = x/t$, hence is constant along $x = 0$. If we denote this
solution by $u_{i+1/2}$, then the flux is taken as

$$h(u^-_{i+1/2}, u^+_{i+1/2}) = f(u_{i+1/2}).$$
In the scalar case, the exact Riemann solver gives the Godunov flux (4.6).
Exact Riemann solvers can be obtained for many systems including the Euler
equations of compressible gas, which are used very often in practice. However,
it is usually very costly to get this solution (for the Euler equations of
compressible gas, an iterative procedure is needed to obtain this solution,
see [94]). In practice, approximate Riemann solvers are usually good enough.
As in the scalar case, the quality of the solution is usually very sensitive
to the choice of approximate Riemann solvers for lower order schemes (first or
second order), but this sensitivity decreases with an increasing order of
accuracy. The simplest approximate Riemann solver (albeit the most
dissipative) is again the Lax-Friedrichs solver (4.8), except that now the
constant $\alpha$ is taken as

$$\alpha = \max_u \max_{1 \le j \le m} |\lambda_j(u)| \tag{4.26}$$

where $\lambda_j(u)$ are the eigenvalues of the Jacobian $f'(u)$, (4.20). The
maximum is again taken over the relevant range of $u$.

We  summarize  the  procedure  in  the  following 

Algorithm 4.4. Component-wise finite volume 1D system ENO and WENO schemes.

1. For each component of the solution $u$, apply the scalar ENO Algorithm
3.1 or WENO Algorithm 3.2 to reconstruct the corresponding component of the
solution at the cell interfaces, $u^\pm_{i+1/2}$ for all $i$;

2. Apply an exact or approximate Riemann solver to compute the flux
$\hat f_{i+1/2}$ for all $i$ in (4.5);

3. Form the scheme (4.4).

□ 


For the finite difference formulation, a smooth flux splitting (4.11) is again
needed. The condition (4.12) now becomes that the two Jacobians

$$\frac{df^+(u)}{du}, \qquad \frac{df^-(u)}{du} \tag{4.27}$$

are still diagonalizable (preferably by the same eigenvectors $R(u)$ as for
$f'(u)$), and have only non-negative / non-positive eigenvalues, respectively.
We again recommend the Lax-Friedrichs flux splitting (4.13), with $\alpha$
given by (4.26), because of its simplicity and smoothness. A somewhat more
complicated Lax-Friedrichs type flux splitting is:

$$f^\pm(u) = \frac{1}{2}\left( f(u) \pm R(u)\, \bar\Lambda\, R^{-1}(u)\, u \right),$$

where $R(u)$ and $R^{-1}(u)$ are defined in (4.22), and

$$\bar\Lambda = \mathrm{diag}(\bar\lambda_1, \ldots, \bar\lambda_m)$$

where $\bar\lambda_j = \max_u |\lambda_j(u)|$, and the maximum is again taken
over the relevant range of $u$. This way the dissipation is added in each
field according to the maximum size of eigenvalues in that field, not
globally. One could also use other flux splittings, such as the van Leer
splitting for gas dynamics [101]. However, for higher order schemes, the flux
splitting must be sufficiently smooth in order to retain the order of
accuracy.

With these flux splittings, we can again use the scalar recipes to form the
finite difference scheme: just compute the positive and negative fluxes
$\hat f^+_{i+1/2}$ and $\hat f^-_{i+1/2}$ component by component.

We summarize the procedure in the following

Algorithm 4.5. Component-wise finite difference 1D system ENO and WENO
schemes.

1. Find a flux splitting (4.11). The simplest example is the Lax-Friedrichs
flux splitting (4.13), with $\alpha$ given by (4.26);

2. For each component of the solution $u$, apply the scalar Algorithm 4.3 to
reconstruct the corresponding component of the numerical flux
$\hat f_{i+1/2}$;

3. Form the scheme (4.9).

□ 

These  component  by  component  versions  of  ENO  and  WENO  schemes 
are  simple  and  cost  effective.  They  work  reasonably  well  for  many  problems, 
especially  when  the  order  of  accuracy  is  not  high  (second  or  sometimes  third 
order).  However,  for  more  demanding  test  problems,  or  when  the  order  of 
accuracy  is  high,  it  is  usually  advisable  to  use  the  following  more  costly,  but 
much  more  robust  characteristic  decompositions. 

To explain the characteristic decomposition, we start with a simple example
where $f(u) = Au$ in (4.1) is linear and $A$ is a constant matrix. In
this situation, the eigenvalues (4.20), the eigenvectors (4.21), and the
related matrices $R$, $R^{-1}$ and $\Lambda$ (4.22)-(4.23), are all constant.
If we define a change of variable

$$v = R^{-1} u, \tag{4.28}$$

then the PDE (4.1) becomes diagonal:

$$v_t + \Lambda v_x = 0 \tag{4.29}$$

that is, the $m$ equations in (4.29) are decoupled and each one is a scalar
linear convection equation of the form

$$w_t + \lambda_j w_x = 0. \tag{4.30}$$

We can thus use the reconstruction or flux evaluation techniques for the
scalar equations, discussed in Sections 4.1 and 4.2, to handle each of the
equations in (4.30). After we obtain the results, we can "come back" to the
physical space $u$ by using the inverse of (4.28):

$$u = R v. \tag{4.31}$$

For example, if the reconstructed polynomial for each component $j$ in (4.29)
is denoted by $q_j(x)$, then we form

$$q(x) = \begin{pmatrix} q_1(x) \\ \vdots \\ q_m(x) \end{pmatrix} \tag{4.32}$$

and obtain the reconstruction in the physical space by using (4.31):

$$p(x) = R\, q(x). \tag{4.33}$$


The flux evaluations for the finite difference schemes can be handled similarly.
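
The constant coefficient case can be sketched in a few lines. The example
below is a first order illustration under stated assumptions: each
characteristic field is upwinded by the sign of its eigenvalue using the
neighbouring states directly (no high order reconstruction), the matrix $A$ is
assumed to have real eigenvalues and a complete set of eigenvectors, and the
function name is hypothetical.

```python
import numpy as np

def characteristic_upwind_flux(A, u_left, u_right):
    """First order upwind flux for u_t + A u_x = 0 via (4.28)-(4.31)."""
    lam, R = np.linalg.eig(A)        # columns of R are right eigenvectors
    R_inv = np.linalg.inv(R)         # rows of R_inv are left eigenvectors

    # (4.28): transform both states to characteristic variables.
    v_left = R_inv @ u_left
    v_right = R_inv @ u_right

    # (4.30): each field is a scalar convection equation with speed lam_j;
    # take the state on the upwind side of the interface.
    g = np.where(lam >= 0.0, lam * v_left, lam * v_right)

    # (4.31): transform the characteristic flux back to physical space.
    return R @ g

# Example: the 2x2 system with eigenvalues -1 and +1.
A = np.array([[0.0, 1.0], [1.0, 0.0]])
flux = characteristic_upwind_flux(A, np.array([1.0, 0.0]), np.array([0.0, 0.0]))
```

When both states coincide the routine returns $Au$, which is the consistency
requirement on a numerical flux. In a high order scheme the two states would
be replaced by ENO or WENO reconstructions performed field by field.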

We now come to the situation where $f'(u)$ is not constant. The trouble
is that now all the matrices $R(u)$, $R^{-1}(u)$ and $\Lambda(u)$ are
dependent upon $u$. We must "freeze" them locally in order to carry out a
similar procedure as in the constant coefficient case. Thus, to compute the
flux at the cell boundary $x_{i+1/2}$, we would need an approximation to the
Jacobian at the middle value $u_{i+1/2}$. This can be simply taken as the
arithmetic mean

$$u_{i+1/2} = \frac{1}{2}(u_i + u_{i+1}), \tag{4.34}$$

or as a more elaborate average satisfying some nice properties, e.g. the mean
value theorem

$$f(u_{i+1}) - f(u_i) = f'(u_{i+1/2})(u_{i+1} - u_i). \tag{4.35}$$

The Roe average [82] is such an example for the compressible Euler equations
of gas dynamics and some other physical systems. It is also possible to use
two different one-sided Jacobians at a higher computational cost [28].

Once we have this $u_{i+1/2}$, we will use $R(u_{i+1/2})$,
$R^{-1}(u_{i+1/2})$ and $\Lambda(u_{i+1/2})$ to help evaluate the numerical
flux at $x_{i+1/2}$. We thus omit the notation $i+\frac{1}{2}$ and still
denote these matrices by $R$, $R^{-1}$ and $\Lambda$, etc. We then repeat the
procedure described above for linear systems. The difference is that the
matrices $R$, $R^{-1}$ and $\Lambda$ are different at different locations
$x_{i+1/2}$, hence the cost of the operation is greatly increased.

In  summary,  we  have  the  following  procedures: 

Algorithm 4.6. Characteristic-wise finite volume 1D ENO and WENO schemes.

1. Compute the divided or undivided differences of the cell averages
$\bar u$, for all $i$;

2. At each fixed $x_{i+1/2}$, do the following:

(a) Compute an average state $u_{i+1/2}$, using either the simple mean (4.34)
or a Roe average satisfying (4.35);

(b) Compute the right eigenvectors, the left eigenvectors, and the
eigenvalues of the Jacobian $f'(u_{i+1/2})$, (4.20)-(4.23), and denote them by

$$R = R(u_{i+1/2}), \qquad R^{-1} = R^{-1}(u_{i+1/2}), \qquad \Lambda = \Lambda(u_{i+1/2});$$

(c) Transform all those differences computed in Step 1, which are in the
potential stencil of the ENO and WENO reconstructions for obtaining
$u^\pm_{i+1/2}$, to the local characteristic fields by using (4.28). For
example,

$$\bar v_j = R^{-1} \bar u_j, \qquad j \text{ in a neighborhood of } i;$$

(d) Perform the scalar ENO or WENO reconstruction Algorithm 4.1,
for each component of the characteristic variables $\bar v$, to obtain the
corresponding component of the reconstruction $v^\pm_{i+1/2}$;

(e) Transform back into physical space by using (4.31):

$$u^\pm_{i+1/2} = R\, v^\pm_{i+1/2};$$

3. Apply an exact or approximate Riemann solver to compute the flux
$\hat f_{i+1/2}$ for all $i$ in (4.5); then form the scheme (4.4).

□ 


Similarly,  the  procedure  to  obtain  a  finite  difference  ENO-Roe  type  scheme 
using  the  local  characteristic  decomposition  is: 

Algorithm 4.7. Characteristic-wise finite difference 1D system, Roe-type
schemes.

1. Compute the undivided differences of the flux $f(u)$ for all $i$;

2. At each fixed $x_{i+1/2}$, do the following:

(a) Compute an average state $u_{i+1/2}$, using either the simple mean (4.34)
or a Roe average satisfying (4.35);

(b) Compute the right eigenvectors, the left eigenvectors, and the
eigenvalues of the Jacobian $f'(u_{i+1/2})$, (4.20)-(4.23), and denote them by

$$R = R(u_{i+1/2}), \qquad R^{-1} = R^{-1}(u_{i+1/2}), \qquad \Lambda = \Lambda(u_{i+1/2});$$

(c) Transform all those differences computed in Step 1, which are in the
potential stencil of the ENO and WENO reconstructions for obtaining
the flux $\hat f_{i+1/2}$, to the local characteristic fields by using (4.28).
For example,

$$\bar v_j = R^{-1} f(u_j), \qquad j \text{ in a neighborhood of } i;$$

(d) Perform the scalar ENO or WENO Roe-type Algorithm 4.2, for each
component of the characteristic variables $\bar v$, to obtain the
corresponding component of the flux $\hat v_{i+1/2}$. The Roe speed
$\bar a_{i+1/2}$ is replaced by the eigenvalue $\lambda_l(u_{i+1/2})$ for the
$l$-th component of the characteristic variables $\bar v$;

(e) Transform back into physical space by using (4.31):

$$\hat f_{i+1/2} = R\, \hat v_{i+1/2};$$

3. Form the scheme (4.9).

□ 

Finally,  the  procedure  to  obtain  a  finite  difference  flux  splitting  ENO  or 
WENO  scheme  using  the  local  characteristic  decomposition  is: 

Algorithm 4.8. Characteristic-wise finite difference 1D system, flux splitting schemes.

1. Compute the undivided differences of the flux $f(u)$ and the solution $u$ for all $i$;

2. At each fixed $x_{i+1/2}$, do the following:

(a) Compute an average state $u_{i+1/2}$, using either the simple mean (4.34) or a Roe average satisfying (4.35);

(b) Compute the right eigenvectors, the left eigenvectors, and the eigenvalues of the Jacobian $f'(u_{i+1/2})$, (4.20)-(4.23), and denote them by

$$R = R(u_{i+1/2}), \quad R^{-1} = R^{-1}(u_{i+1/2}), \quad \Lambda = \Lambda(u_{i+1/2});$$

(c) Transform all those differences computed in Step 1, which are in the potential stencil of the ENO and WENO reconstructions for obtaining the flux $\hat{f}_{i+1/2}$, to the local characteristic fields by using (4.28). For example,

$$v_j = R^{-1} u_j, \quad g_j = R^{-1} f(u_j), \quad j \text{ in a neighborhood of } i;$$

(d) Perform the scalar flux splitting ENO or WENO Algorithm 4.3, for each component of the characteristic variables, to obtain the corresponding component of the flux $\hat{g}^{\pm}_{i+1/2}$. For the most commonly used Lax-Friedrichs flux splitting, we can use, for the $l$-th component of the characteristic variables, the viscosity coefficient

$$\alpha = \max_{1 \le j \le N} |\lambda_l(u_j)|;$$

Local Lax-Friedrichs flux splitting can also be used here, when $\alpha$ is chosen as a maximum of $|\lambda_l(u_i)|$ and $|\lambda_l(u_{i+1})|$, plus perhaps several other neighbors, rather than as a maximum over the whole domain.

(e) Transform back into physical space by using (4.31):

$$\hat{f}^{\pm}_{i+1/2} = R\, \hat{g}^{\pm}_{i+1/2}$$

3. Form the flux by taking

$$\hat{f}_{i+1/2} = \hat{f}^{+}_{i+1/2} + \hat{f}^{-}_{i+1/2}$$

and then form the scheme (4.9).


□ 
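The steps of Algorithm 4.8 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the callables `f` and `eigensystem` are hypothetical user-supplied functions, the arithmetic mean of the two neighboring states is assumed for the average state, and a first order upwind value stands in for the scalar ENO or WENO procedure of Algorithm 4.3, so the sketch shows the projection and splitting but not high order accuracy.

```python
import numpy as np

def charwise_lf_flux(u, f, eigensystem):
    """Fluxes at x_{i+1/2}, i = 0..N-2, from point values u of shape (N, m).

    f(u_j) returns the flux vector; eigensystem(u) returns (R, R_inv, lam)
    with the eigenvalues always listed in the same field order.
    Both callables are hypothetical and must be supplied by the user.
    """
    N, m = u.shape
    fu = np.array([f(uj) for uj in u])
    # Global Lax-Friedrichs viscosity, one coefficient per characteristic field.
    alpha = np.max(np.abs([eigensystem(uj)[2] for uj in u]), axis=0)
    fhat = np.empty((N - 1, m))
    for i in range(N - 1):
        u_avg = 0.5 * (u[i] + u[i + 1])      # step 2(a): arithmetic mean (assumed)
        R, R_inv, _ = eigensystem(u_avg)     # step 2(b)
        v = u[i:i + 2] @ R_inv.T             # step 2(c): v_j = R^{-1} u_j
        g = fu[i:i + 2] @ R_inv.T            #            g_j = R^{-1} f(u_j)
        g_plus = 0.5 * (g + alpha * v)       # Lax-Friedrichs splitting per field
        g_minus = 0.5 * (g - alpha * v)
        # Step 2(d), simplified: first order upwind values replace ENO/WENO.
        ghat_plus, ghat_minus = g_plus[0], g_minus[1]
        # Step 2(e) and step 3: back to physical space, then add both parts.
        fhat[i] = R @ ghat_plus + R @ ghat_minus
    return fhat
```

For a linear system $f(u) = Au$ with constant $A$, this reduces to the upwind flux $A^{+}u_i + A^{-}u_{i+1}$, which is a convenient sanity check before a real ENO or WENO reconstruction is plugged into step 2(d).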


There have been recent attempts to simplify this characteristic decomposition. For example, for the compressible Euler equations of gas dynamics, Jiang and Shu [55] used smoothness indicators based on density and pressure to perform the so-called pseudo characteristic decompositions. There are also second and sometimes third order component-wise ENO type schemes [75], [70], with limited success for higher order methods.


4.5  Boundary  Conditions 

For periodic boundary conditions, or problems with compact support for the entire computation (not just the initial data), there is no difficulty in implementing boundary conditions: one simply sets as many ghost points as needed using either the periodicity condition or the compactness of the solution.

Other types of boundary conditions should be handled according to their type: for reflective or symmetry boundary conditions, one would set as many ghost points as needed, then use the symmetry/antisymmetry properties to prescribe solution values at those ghost points. For inflow or partially inflow (e.g. a subsonic outflow where one of the characteristic waves flows in) boundary conditions, one would usually use the physical inflow boundary condition at the exact boundary (for example, if $x_{1/2}$ is the left boundary and a finite volume scheme is used, one would use the given boundary value $u_b$ as $u^{-}_{1/2}$ in the monotone flux at $x_{1/2}$; if $x_0$ is the left boundary and a finite difference scheme is used, one would use the given boundary value $u_b$ as $u_0$). Apart from that, the most natural way of treating boundary conditions for the ENO scheme is to use only the available values inside the computational domain when choosing the stencil. In other words, only stencils completely contained inside the computational domain are used in the ENO stencil choosing process described in the previous algorithms. In practical implementation, in order to avoid logical structures to distinguish whether a given stencil is completely inside the computational domain, one could set all the ghost values outside the computational domain to be very large with large variations (e.g. setting $u_{-j} = (10j)^{10}$ if $x_{-j}$, for $j = 1, 2, \dots$, are ghost points). This way the ENO stencil choosing procedure will automatically avoid choosing any stencil containing ghost points. Another way of treating boundary conditions is to use extrapolation of suitable order to set the values of the solution in all necessary ghost points. For scalar problems this is actually equivalent to the approach of using only the stencils inside the computational domain in the ENO procedure. WENO can be handled in a similar fashion.
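The ghost value trick can be illustrated with a small Python sketch. It assumes a uniform grid and a simplified stencil choice that compares undivided differences of the data directly; the function names `eno_stencil_left` and `pad_with_ghosts` are hypothetical and are not taken from the algorithms above.

```python
import numpy as np

def eno_stencil_left(v, i, k):
    """Left-most index of the k-point ENO stencil grown from point i."""
    diffs = [np.asarray(v, dtype=float)]
    for m in range(1, k):
        diffs.append(np.diff(diffs[-1]))   # m-th undivided differences
    left = i
    for m in range(1, k):
        # Growing to the left uses the difference starting at left-1,
        # growing to the right uses the one starting at left.
        if abs(diffs[m][left - 1]) < abs(diffs[m][left]):
            left -= 1
    return left

def pad_with_ghosts(v, ng):
    """Attach ng ghost values (10 j)^10, j = 1..ng, on each side of v."""
    j = np.arange(1, ng + 1, dtype=float)
    ghosts = (10.0 * j) ** 10
    return np.concatenate([ghosts[::-1], v, ghosts])
```

With `ng >= k` ghost values on each side, any candidate stencil that touches a ghost point has a huge undivided difference, so the comparison always prefers the candidate that stays inside the domain and no explicit boundary logic is needed.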

Stability  analysis  (GKS  analysis  [39],  [98])  can  be  used  to  study  the  linear 
stability  when  the  boundary  treatment  described  above  is  applied  to  a  fixed 
stencil  upwind  biased  scheme.  For  most  practical  situations  the  schemes  are 
linearly  stable  [3]. 


5  Reconstruction  and  Approximation  in  Multi 
Dimensions 


In  this  section  we  describe  how  the  ideas  of  reconstruction  and  approximation 
in  Sect.  2  are  generalized  to  multi  space  dimensions.  We  will  concentrate  our 
discussion  in  2D,  although  things  carry  over  to  higher  dimensions  as  well. 

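In the first two subsections we will consider Cartesian grids, that is, the domain is a rectangle

$$[a,b] \times [c,d] \qquad (5.1)$$

covered by cells

$$I_{ij} \equiv [x_{i-1/2}, x_{i+1/2}] \times [y_{j-1/2}, y_{j+1/2}], \quad 1 \le i \le N_x, \quad 1 \le j \le N_y \qquad (5.2)$$

where

$$a = x_{1/2} < x_{3/2} < \dots < x_{N_x-1/2} < x_{N_x+1/2} = b,$$

and

$$c = y_{1/2} < y_{3/2} < \dots < y_{N_y-1/2} < y_{N_y+1/2} = d.$$

The centers of the cells are

$$(x_i, y_j), \quad x_i \equiv \frac{1}{2}\left(x_{i-1/2} + x_{i+1/2}\right), \quad y_j \equiv \frac{1}{2}\left(y_{j-1/2} + y_{j+1/2}\right), \qquad (5.3)$$

and we still use

$$\Delta x_i \equiv x_{i+1/2} - x_{i-1/2}, \quad i = 1, 2, \dots, N_x \qquad (5.4)$$

and

$$\Delta y_j \equiv y_{j+1/2} - y_{j-1/2}, \quad j = 1, 2, \dots, N_y \qquad (5.5)$$

to denote the grid sizes. We denote the maximum grid sizes by

$$\Delta x \equiv \max_{1 \le i \le N_x} \Delta x_i, \quad \Delta y \equiv \max_{1 \le j \le N_y} \Delta y_j, \qquad (5.6)$$

and assume that $\Delta x$ and $\Delta y$ are of the same magnitude (their ratio is bounded from above and below during refinement). Finally,

$$\Delta \equiv \max(\Delta x, \Delta y). \qquad (5.7)$$

5.1  Reconstruction  from  Cell  Averages  —  Rectangular  Case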

The  approximation  problem  we  will  face,  in  solving  hyperbolic  conservation 
laws  using  cell  averages  (finite  volume  schemes,  see  Sect.  7.1),  is  still  the 
following  reconstruction  problem. 


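Problem 5.1. Two dimensional reconstruction for rectangles.

Given the cell averages of a function $v(x,y)$:

$$\bar{v}_{ij} \equiv \frac{1}{\Delta x_i \Delta y_j} \int_{y_{j-1/2}}^{y_{j+1/2}} \int_{x_{i-1/2}}^{x_{i+1/2}} v(\xi,\eta)\, d\xi\, d\eta, \quad i = 1, 2, \dots, N_x, \quad j = 1, 2, \dots, N_y, \qquad (5.8)$$

find a polynomial $p_{ij}(x,y)$, preferably of degree at most $k-1$, for each cell $I_{ij}$, such that it is a $k$-th order accurate approximation to the function $v(x,y)$ inside $I_{ij}$:

$$p_{ij}(x,y) = v(x,y) + O(\Delta^k), \quad (x,y) \in I_{ij}. \qquad (5.9)$$

In particular, this gives approximations to the function $v(x,y)$ at the cell boundaries

$$v^{-}_{i+1/2,y} = p_{ij}(x_{i+1/2}, y), \quad v^{+}_{i-1/2,y} = p_{ij}(x_{i-1/2}, y), \quad i = 1, \dots, N_x, \quad y_{j-1/2} \le y \le y_{j+1/2},$$

$$v^{-}_{x,j+1/2} = p_{ij}(x, y_{j+1/2}), \quad v^{+}_{x,j-1/2} = p_{ij}(x, y_{j-1/2}), \quad j = 1, \dots, N_y, \quad x_{i-1/2} \le x \le x_{i+1/2},$$

which are $k$-th order accurate:

$$v^{\pm}_{i+1/2,y} = v(x_{i+1/2}, y) + O(\Delta^k), \quad i = 0, 1, \dots, N_x, \quad y_{j-1/2} \le y \le y_{j+1/2} \qquad (5.10)$$

and

$$v^{\pm}_{x,j+1/2} = v(x, y_{j+1/2}) + O(\Delta^k), \quad j = 0, 1, \dots, N_y, \quad x_{i-1/2} \le x \le x_{i+1/2}. \qquad (5.11)$$

□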


Again we will not discuss boundary conditions in this section. We thus assume that $\bar{v}_{ij}$ is also available for $i \le 0$, $i > N_x$ and for $j \le 0$, $j > N_y$ if needed.

In  the  following  we  describe  a  general  procedure  to  solve  Problem  5.1. 

Given the location $I_{ij}$ and the order of accuracy $k$, we again first choose a "stencil", based on $\frac{k(k+1)}{2}$ neighboring cells, the collection of these cells still being denoted by $S(i,j)$. We then try to find a polynomial of degree at most $k-1$, denoted by $p(x,y)$ (we again drop the subscript $ij$ when it does not cause confusion), whose cell average in each of the cells in $S(i,j)$ agrees with that of $v(x,y)$:

$$\frac{1}{\Delta x_l \Delta y_m} \int_{y_{m-1/2}}^{y_{m+1/2}} \int_{x_{l-1/2}}^{x_{l+1/2}} p(\xi,\eta)\, d\xi\, d\eta = \bar{v}_{lm}, \quad \text{if } I_{lm} \in S(i,j). \qquad (5.12)$$

We first remark that there are now many more candidate stencils $S(i,j)$ than in the 1D case. More importantly, unlike in the 1D case, here we encounter the following essential difficulties:

- Not all of the candidate stencils can be used to obtain a polynomial $p(x,y)$ of degree at most $k-1$ satisfying condition (5.12).

For example, it is an easy exercise to show that neither existence nor uniqueness holds, if one wants to reconstruct a first degree polynomial $p(x,y)$ satisfying (5.12) for the three horizontal cells

$$S(i,j) = \{I_{i-1,j},\ I_{ij},\ I_{i+1,j}\}.$$

To see this, let's assume that

$$I_{i-1,j} = [-2\Delta, -\Delta] \times [0, \Delta], \quad I_{ij} = [-\Delta, 0] \times [0, \Delta], \quad I_{i+1,j} = [0, \Delta] \times [0, \Delta],$$

and the first degree polynomial $p(x,y)$ is given by

$$p(x,y) = \alpha + \beta x + \gamma y,$$

then condition (5.12) implies

$$\alpha - \tfrac{3}{2}\Delta\beta + \tfrac{1}{2}\Delta\gamma = \bar{v}_{i-1,j}, \qquad \alpha - \tfrac{1}{2}\Delta\beta + \tfrac{1}{2}\Delta\gamma = \bar{v}_{ij}, \qquad \alpha + \tfrac{1}{2}\Delta\beta + \tfrac{1}{2}\Delta\gamma = \bar{v}_{i+1,j},$$

which is a singular linear system for $\alpha$, $\beta$ and $\gamma$.

- Even if one obtains such a polynomial $p(x,y)$, there is no guarantee that the accuracy conditions (5.9) will hold. We again use the same simple example. If we pick the function

$$v(x,y) = 0,$$

then one of the polynomials of degree one satisfying the condition (5.12) is

$$p(x,y) = \Delta - 2y;$$

clearly the difference

$$v(x,0) - p(x,0) = -\Delta$$

is not of the size $O(\Delta^2)$ in $x_{i-1/2} \le x \le x_{i+1/2}$, as is required by (5.9).




This  difficulty  will  be  more  profound  for  unstructured  meshes  such  as 
triangles.  See,  for  example,  [1],  and  Sect.  5.3. 
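Both difficulties in the three-horizontal-cell example can be checked numerically. The following Python sketch builds the $3 \times 3$ matrix of the system above, using the fact that the cell average of a linear function over a rectangle equals its value at the cell center; the helper name `horizontal_stencil_matrix` and the value $\Delta = 0.1$ are chosen here only for illustration.

```python
import numpy as np

def horizontal_stencil_matrix(delta):
    """Rows: cell averages of (1, x, y) over the three horizontal cells."""
    x_centers = np.array([-1.5, -0.5, 0.5]) * delta
    y_center = 0.5 * delta               # same for all three cells
    return np.column_stack([np.ones(3), x_centers, np.full(3, y_center)])

delta = 0.1
M = horizontal_stencil_matrix(delta)
rank = np.linalg.matrix_rank(M)          # the system is singular

# p(x, y) = delta - 2 y, i.e. (alpha, beta, gamma) = (delta, 0, -2),
# has vanishing averages on all three cells although p is not zero.
averages_of_p = M @ np.array([delta, 0.0, -2.0])
```

The first and third columns of `M` are proportional, which is why the rank drops: a stencil that is one cell high carries no information about the $y$ dependence of $p$.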

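For rectangular meshes, if we use the tensor products of 1D polynomials, i.e. use polynomials in $Q^{k-1}$:

$$p(x,y) = \sum_{m=0}^{k-1} \sum_{l=0}^{k-1} a_{lm}\, x^l y^m,$$

then things can proceed as in 1D. We restrict ourselves to the following tensor product stencils:

$$S_{rs}(i,j) = \{I_{lm} : i - r \le l \le i + k - 1 - r, \ j - s \le m \le j + k - 1 - s\},$$

then we can address Problem 5.1 by introducing the two dimensional primitives:

$$V(x,y) = \int_{-\infty}^{y} \int_{-\infty}^{x} v(\xi,\eta)\, d\xi\, d\eta.$$

Clearly

$$V(x_{i+1/2}, y_{j+1/2}) = \int_{-\infty}^{y_{j+1/2}} \int_{-\infty}^{x_{i+1/2}} v(\xi,\eta)\, d\xi\, d\eta = \sum_{m=-\infty}^{j} \sum_{l=-\infty}^{i} \bar{v}_{lm}\, \Delta x_l\, \Delta y_m,$$

hence as in the 1D case, with the knowledge of the cell averages $\bar{v}$ we know the primitive function $V$ exactly at cell corners.

On a tensor product stencil

$$\tilde{S}_{rs}(i,j) = \{(x_{l+1/2}, y_{m+1/2}) : i - r - 1 \le l \le i + k - 1 - r, \ j - s - 1 \le m \le j + k - 1 - s\}$$

there is a unique polynomial $P(x,y)$ in $Q^k$ which interpolates $V$ at every point in $\tilde{S}_{rs}(i,j)$. We take the mixed derivative of the polynomial $P$ to get:

$$p(x,y) = \frac{\partial^2 P(x,y)}{\partial x\, \partial y},$$

then $p(x,y)$ is in $Q^{k-1}$, approximates $v(x,y)$, which is the mixed derivative of $V(x,y)$, to $k$-th order:

$$v(x,y) - p(x,y) = O(\Delta^k)$$

and also satisfies (5.12):

$$\frac{1}{\Delta x_l \Delta y_m} \int_{y_{m-1/2}}^{y_{m+1/2}} \int_{x_{l-1/2}}^{x_{l+1/2}} p(\xi,\eta)\, d\xi\, d\eta = \frac{1}{\Delta x_l \Delta y_m} \int_{y_{m-1/2}}^{y_{m+1/2}} \int_{x_{l-1/2}}^{x_{l+1/2}} \frac{\partial^2 P}{\partial \xi\, \partial \eta}(\xi,\eta)\, d\xi\, d\eta$$

$$= \frac{1}{\Delta x_l \Delta y_m} \left( P(x_{l+1/2}, y_{m+1/2}) - P(x_{l+1/2}, y_{m-1/2}) - P(x_{l-1/2}, y_{m+1/2}) + P(x_{l-1/2}, y_{m-1/2}) \right)$$

$$= \frac{1}{\Delta x_l \Delta y_m} \left( V(x_{l+1/2}, y_{m+1/2}) - V(x_{l+1/2}, y_{m-1/2}) - V(x_{l-1/2}, y_{m+1/2}) + V(x_{l-1/2}, y_{m-1/2}) \right)$$

$$= \frac{1}{\Delta x_l \Delta y_m} \int_{y_{m-1/2}}^{y_{m+1/2}} \int_{x_{l-1/2}}^{x_{l+1/2}} v(\xi,\eta)\, d\xi\, d\eta = \bar{v}_{lm}, \quad i - r \le l \le i + k - 1 - r, \quad j - s \le m \le j + k - 1 - s.$$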

There is a practical way to perform the reconstruction in 2D. We first perform a one dimensional reconstruction (Problem 2.1), say in the $y$ direction, obtaining one dimensional cell averages of the function $v$ in the other direction (say in the $x$ direction). We then perform a reconstruction in the other direction. Notice that if ENO is used in each direction, the effective two dimensional stencil may not be a tensor product.
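A minimal Python sketch of this direction-by-direction procedure follows. It assumes a uniform grid and replaces the ENO stencil choice by a fixed three-cell stencil; the weights $-\frac{1}{6}, \frac{5}{6}, \frac{1}{3}$ are the standard third order weights for the value at the right cell edge from the averages of cells $i-1$, $i$, $i+1$. The function names `sweep` and `corner_values` are hypothetical.

```python
import numpy as np

# Third order value at the right edge of cell i from the averages of
# cells (i-1, i, i+1) on a uniform grid.
EDGE_WEIGHTS = (-1.0 / 6.0, 5.0 / 6.0, 1.0 / 3.0)

def sweep(vbar, axis):
    """One dimensional reconstruction along `axis` for all interior cells."""
    v = np.moveaxis(np.asarray(vbar, dtype=float), axis, 0)
    c0, c1, c2 = EDGE_WEIGHTS
    out = c0 * v[:-2] + c1 * v[1:-1] + c2 * v[2:]
    return np.moveaxis(out, 0, axis)

def corner_values(vbar):
    """Point values at (x_{i+1/2}, y_{j+1/2}) from 2D cell averages vbar[i, j]."""
    # Sweep in y: 2D averages become x-averages on the lines y = y_{j+1/2}.
    line_averages = sweep(vbar, axis=1)
    # Sweep in x: x-averages become point values at x = x_{i+1/2}.
    return sweep(line_averages, axis=0)
```

Because the stencil is fixed, the effective 2D stencil here is a tensor product; with an ENO choice in each sweep the selected cells could differ from row to row, as noted above.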

It should be remarked that the cost to do this 2D reconstruction is very high: for each grid point, if the cost to perform a one dimensional reconstruction is $c$, then we need $2c$ per grid point to perform this 2D reconstruction. In general $n$ space dimensions, the cost grows to $nc$.

We also remark that to use polynomials in $Q^{k-1}$ is a waste: to get the correct order of accuracy only polynomials in $P^{k-1}$ are needed. However, there is no natural way of utilizing polynomials in $P^{k-1}$ (see the comments above, the paper of Abgrall [1], and Sect. 5.3).

The reconstruction problem, Problem 5.1, can also be raised for general, non-Cartesian meshes, such as triangles. However, the solution becomes much more complicated. For discussions, see for example [1] and Sect. 5.3.


5.2  Conservative  Approximation  to  the  Derivative  from  Point 
Values 

The second approximation problem we will face, in solving hyperbolic conservation laws using point values (finite difference schemes, see Sect. 7.2), is again the following problem in obtaining high order conservative approximation to the derivative from point values [89,90]. As in the 1D case, here we also assume that the grid is uniform in each direction. We again ignore the boundary conditions and assume that $v_{ij}$ is available for $i \le 0$ and $i > N_x$, and for $j \le 0$ and $j > N_y$.

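Problem 5.2. Two dimensional conservative approximation to the derivatives.

Given the point values of a function $v(x,y)$:

$$v_{ij} \equiv v(x_i, y_j), \quad i = 1, 2, \dots, N_x, \quad j = 1, 2, \dots, N_y, \qquad (5.13)$$

find numerical flux functions

$$\hat{v}_{i+1/2,j} \equiv \hat{v}(v_{i-r,j}, \dots, v_{i+k-1-r,j}), \quad i = 0, 1, \dots, N_x \qquad (5.14)$$

and

$$\hat{v}_{i,j+1/2} \equiv \hat{v}(v_{i,j-s}, \dots, v_{i,j+k-1-s}), \quad j = 0, 1, \dots, N_y \qquad (5.15)$$

such that the flux differences approximate the derivatives $v_x(x,y)$ and $v_y(x,y)$ to $k$-th order accuracy:

$$\frac{1}{\Delta x}\left(\hat{v}_{i+1/2,j} - \hat{v}_{i-1/2,j}\right) = v_x(x_i, y_j) + O(\Delta x^k), \quad i = 0, 1, \dots, N_x, \qquad (5.16)$$

and

$$\frac{1}{\Delta y}\left(\hat{v}_{i,j+1/2} - \hat{v}_{i,j-1/2}\right) = v_y(x_i, y_j) + O(\Delta y^k), \quad j = 0, 1, \dots, N_y. \qquad (5.17)$$

□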

The  solution  of  this  problem  is  essential  for  the  high  order  conservative 
schemes  based  on  point  values  (finite  difference)  rather  than  on  cell  averages 
(finite  volume). 

Having seen the complication of reconstructions in the previous subsection for multi space dimensions, it is a relief to see that conservative approximation to the derivative from point values is as simple in multi dimensions as in 1D. In fact, for fixed $j$, if we take

$$w(x) = v(x, y_j)$$

then to obtain $v_x(x_i, y_j) = w'(x_i)$ we only need to apply the one dimensional procedure in Sect. 2.2, Problem 2.2, to the one dimensional function $w(x)$. The same applies to $v_y(x,y)$.
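The following Python sketch illustrates how a one dimensional conservative flux difference is reused line by line in 2D. It assumes a uniform periodic grid and uses a fixed fourth order central flux as a stand-in for the ENO or WENO flux of Sect. 2.2; the function name `flux_difference` is hypothetical.

```python
import numpy as np

def flux_difference(v, h, axis):
    """Conservative fourth order approximation of the derivative along `axis`."""
    def shift(w, s):
        # shift(w, s)[i] = w[i + s] along `axis`, with periodic wrap-around.
        return np.roll(w, -s, axis=axis)
    # Numerical flux at the interface i+1/2, built from v[i-1], ..., v[i+2].
    flux = (-shift(v, -1) + 7.0 * v + 7.0 * shift(v, 1) - shift(v, 2)) / 12.0
    # Flux difference (flux[i+1/2] - flux[i-1/2]) / h.
    return (flux - shift(flux, -1)) / h
```

Calling `flux_difference(v, dx, axis=0)` gives an approximation to $v_x$ and `flux_difference(v, dy, axis=1)` one to $v_y$; the same one dimensional routine serves both directions, and only the axis changes.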

As in the 1D case, the conservative approximation to derivatives, of third order accuracy or higher, can only be applied to uniform or smoothly varying meshes (curvilinear coordinates). It cannot be applied to general unstructured meshes such as triangles, unless conservation is given up.


5.3  Reconstruction  from  Cell  Averages  —  Triangular  Case 

Assuming that we have a triangulation with $N$ triangles

$$\{\triangle_0, \triangle_1, \dots, \triangle_N\}, \qquad (5.18)$$

the reconstruction problem similar to Problem 5.1, which we will face, in solving hyperbolic conservation laws using cell averages (finite volume schemes, see Sect. 7.1), is the following:

Problem 5.3. Two dimensional reconstruction for triangles.

Given the cell averages of a function $v(x,y)$:

$$\bar{v}_i \equiv \frac{1}{|\triangle_i|} \int_{\triangle_i} v(\xi,\eta)\, d\xi\, d\eta, \quad i = 1, 2, \dots, N, \qquad (5.19)$$

here $|\triangle_i|$ is the area of the triangle $\triangle_i$, find a polynomial $p_i(x,y)$, of degree at most $k-1$, for each triangle $\triangle_i$, such that it is a $k$-th order accurate approximation to the function $v(x,y)$ inside $\triangle_i$:

$$p_i(x,y) = v(x,y) + O(\Delta^k), \quad (x,y) \in \triangle_i, \quad i = 1, \dots, N. \qquad (5.20)$$

Here we again use $\Delta$ to denote a typical length of the triangles, for example the longest side of the triangles. □


Fig.  5.1.  A  typical  stencil 


In particular, (5.20) gives approximations to the function $v(x,y)$ at the triangle boundaries, which are needed in forming the finite volume schemes in Sect. 7.1.

Again we will not discuss boundary conditions in this subsection. We thus assume that $\bar{v}_i$ is also available for triangles $\triangle_i$ outside the boundary of the given triangulation if needed.

The following is still a general procedure to solve Problem 5.3.

Given the location $\triangle_i$ and the order of accuracy $k$, we again first choose a "stencil", based on $m = \frac{k(k+1)}{2}$ neighboring triangles, the collection of these triangles being denoted by $S(i)$. We then try to find a polynomial of degree at most $k-1$, denoted by $p(x,y)$ (we again drop the subscript $i$ when it does not cause confusion), whose cell average in each of the triangles in $S(i)$ agrees with that of $v(x,y)$:

$$\frac{1}{|\triangle_j|} \int_{\triangle_j} p(\xi,\eta)\, d\xi\, d\eta = \bar{v}_j, \quad \text{if } \triangle_j \in S(i). \qquad (5.21)$$

Notice that (5.21) will give us an $m \times m$ linear system. If this linear system has a unique solution, $S(i)$ is called an admissible stencil. Of course, in practice, we also have to worry about any ill conditioned linear system even if it is invertible. For a linear polynomial ($k = 2$, $m = 3$), a stencil formed by $\triangle_i$ itself plus two immediate neighboring triangles is admissible for most triangulations. Thus second order reconstruction is quite easy. We emphasize here that when we talk about order of accuracy in this section it applies only on the approximation level, and also only for "reasonable" triangulations. We will not go into the details of classifying such triangulations.
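For the second order (linear) case, the system (5.21) is small enough to write out directly. The Python sketch below uses the fact that the cell average of a linear polynomial over a triangle equals its value at the centroid, so admissibility amounts to the three centroids not being collinear. The function names and the conditioning threshold `cond_limit` are assumptions made for this illustration only.

```python
import numpy as np

def centroid(triangle):
    """Centroid of a triangle given as three (x, y) vertices."""
    return np.mean(np.asarray(triangle, dtype=float), axis=0)

def linear_reconstruction(triangles, averages, cond_limit=1e8):
    """Coefficients (a, b, c) of p = a + b x + c y matching three cell averages.

    Raises ValueError when the 3 x 3 system is singular or ill conditioned,
    i.e. when the stencil is not admissible in practice.
    """
    c = np.array([centroid(t) for t in triangles])
    # Row j: average of (1, x, y) over triangle j = (1, x_c, y_c).
    M = np.column_stack([np.ones(3), c[:, 0], c[:, 1]])
    if np.linalg.cond(M) > cond_limit:
        raise ValueError("stencil is not admissible (ill conditioned system)")
    return np.linalg.solve(M, np.asarray(averages, dtype=float))
```

For higher degrees the matrix entries are cell averages of the monomials $x^a y^b$, which would have to be computed by quadrature; that part is omitted here.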

For a third order reconstruction we need a quadratic polynomial ($k = 3$), which has $m = 6$ degrees of freedom. This time, some of the stencils consisting of $\triangle_i$ and 5 of its neighbors may not be admissible. It seems that the most robust way is the least square reconstruction procedure suggested by Barth and Frederickson [7]. For the control volume triangle $\triangle_0$ (see Fig. 5.1), let $\triangle_i$, $\triangle_j$, $\triangle_k$ be its three neighbors, and $\triangle_{ia}$, $\triangle_{ib}$ be the two neighbors (other than $\triangle_0$) of $\triangle_i$, and so on. We determine the quadratic polynomial $p^2$ (the superscript denotes the degree) by requiring that $p^2$ has the same cell average as $v$ on $\triangle_0$, and also that $p^2$ has the same cell average as $v$ on

$$\{\triangle_i, \triangle_{ia}, \triangle_{ib}, \triangle_j, \triangle_{ja}, \triangle_{jb}, \triangle_k, \triangle_{ka}, \triangle_{kb}\},$$

but only in a least-square sense (as this is an over-determined system). Notice that some of the neighbors' neighbors ($\triangle_{ia}$, $\triangle_{ib}$, $\triangle_{ja}$, ...) may coincide. For example, $\triangle_{ib}$ might be the same as $\triangle_{ja}$. This, however, does not affect the least square procedure to determine $p^2$.

For a fourth order reconstruction we need a cubic polynomial ($k = 4$), which has $m = 10$ degrees of freedom. If we only consider the case where $ia, ib, ja, jb, ka, kb$ are distinct in the stencil (see Fig. 5.1), it seems that we can construct the cubic polynomial $p^3$ by requiring that its cell average agrees with that of $v$ on each triangle in the 10-triangle stencil shown in Fig. 5.1, for most triangulations.

6  ENO  and  WENO  Reconstruction  and  Approximation 
in  Multi  Dimensions 

For solving hyperbolic conservation laws in multi space dimensions, we are again interested in the class of piecewise smooth functions. We define a piecewise smooth function $v(x,y)$ to be such that, for each fixed $y$, the one dimensional function $w(x) = v(x,y)$ is piecewise smooth in the sense described in Sect. 3. Likewise, for each fixed $x$, the one dimensional function $w(y) = v(x,y)$ is also assumed to be piecewise smooth. Such functions are again "generic" for solutions to multi dimensional hyperbolic conservation laws in practice.

In the previous section, we have already discussed the problems of reconstruction and conservative approximations to derivatives in multi space dimensions. For structured meshes, both the reconstruction and the conservative approximation can be obtained from one dimensional procedures. For unstructured meshes, the procedure has to be truly two dimensional.


6.1  Structured  Meshes 

For a rectangular mesh, we can proceed using the one dimensional results. For the reconstruction, we first use a one dimensional ENO or WENO reconstruction procedure, Algorithm 3.1 or 3.2, on the two dimensional cell averages, say in the $y$ direction, to obtain one dimensional cell averages in $x$ only. Then, another one dimensional reconstruction in the remaining direction, say in the $x$ direction, is performed to recover the function itself, again using the one dimensional ENO or WENO methodology, Algorithm 3.1 or 3.2.

For the conservative approximation to derivatives, since they are already formulated in a dimension by dimension fashion, one dimensional ENO and WENO procedures can be trivially applied. In effect, the FORTRAN program for the 2D problem is the same as the one for the 1D problem, with an outside "do loop".

What about a general geometry that cannot be covered by a Cartesian grid?

If the domain is smooth enough, it usually can be mapped smoothly to a rectangle (or at least to a union of non-overlapping rectangles). That is, the transformation

$$\xi = \xi(x,y), \quad \eta = \eta(x,y) \qquad (6.1)$$

maps the physical domain $\Omega$, where $(x,y)$ belongs, to a rectangular computational domain

$$a \le \xi \le b, \quad c \le \eta \le d. \qquad (6.2)$$

We require the transformation functions (6.1) to be smooth (i.e. they have as many derivatives as the accuracy of the scheme calls for). Using the chain rule, we could write, for example,

$$v_x = \xi_x v_\xi + \eta_x v_\eta \qquad (6.3)$$

We can then use our ENO or WENO approximations on $v_\xi$ and $v_\eta$, as they are now defined in rectangular domains. The smoothness of $\xi_x$ and $\eta_x$ will guarantee that this leads to a high order approximation to $v_x$ as well through (6.3). It is proven in [77] that this way the scheme is still conservative, i.e. the Lax-Wendroff theorem [65] still applies. For Euler equations of gas dynamics or other homogeneous of degree zero systems, it is also possible to write the system in the new $\xi$ and $\eta$ variables as a strongly conservative system, see [77].
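The chain rule (6.3) can be illustrated with a short Python sketch. It assumes a uniform grid in $(\xi, \eta)$ and known metric terms $\xi_x$ and $\eta_x$, and it uses the second order central differences of `numpy.gradient` as a stand-in for the ENO or WENO approximations of $v_\xi$ and $v_\eta$; the function name `vx_via_chain_rule` is hypothetical.

```python
import numpy as np

def vx_via_chain_rule(v, xi_x, eta_x, dxi, deta):
    """Approximate v_x on a uniform (xi, eta) grid via v_x = xi_x v_xi + eta_x v_eta.

    v[i, j] holds the values at (xi_i, eta_j); xi_x and eta_x may be scalars
    or arrays of the same shape as v.
    """
    v_xi = np.gradient(v, dxi, axis=0)     # central differences in the interior
    v_eta = np.gradient(v, deta, axis=1)
    return xi_x * v_xi + eta_x * v_eta
```

As a usage example, for the sheared mapping $x = \xi$, $y = \eta + 0.3\,\xi$ one has $\xi_x = 1$ and $\eta_x = -0.3$, and for $v = xy$ the interior values returned by the sketch reproduce $v_x = y$ up to rounding, since central differences are exact for quadratics.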

If  the  domain  is  really  ugly,  or  if  one  wants  to  use  unstructured  meshes  for 
other  purposes  (e.g.  for  adaptivity),  then  ENO  and  WENO  approximations 
for  unstructured  meshes  must  be  studied.  This  will  be  discussed  briefly  in 
the  next  subsection. 


6.2  Unstructured  Meshes 

For unstructured meshes a truly two dimensional ENO or WENO reconstruction must be carried out. We will present here one approach, adopted by Hu and Shu in [49], [50], for third and fourth order WENO reconstructions. Alternative (lower order) WENO reconstruction procedures can also be found in [32]. For an ENO reconstruction procedure, we refer the readers to [1] and [97].

We start with the third order reconstruction. A key step in building a high order WENO scheme based on lower order polynomials is carried out in the following. We want to construct several linear polynomials whose weighted average will give the same result as the quadratic reconstruction $p^2$ at each quadrature point (the weights are different for different quadrature points). Referring to Fig. 5.1, we can build the following 9 linear polynomials by agreeing with the cell averages of $v$ on the following stencils: $p_1$ on triangles $0, j, k$, $p_2$ on triangles $0, k, i$, $p_3$ on triangles $0, i, j$, $p_4$ on triangles $0, i, ia$, $p_5$ on triangles $0, i, ib$, $p_6$ on triangles $0, j, ja$, $p_7$ on triangles $0, j, jb$, $p_8$ on triangles $0, k, ka$, and $p_9$ on triangles $0, k, kb$. For each quadrature point $(x^G, y^G)$, we want to find the linear weights $\gamma_s$, such that the linear polynomial obtained from a linear combination of these $p_s$

$$R(x,y) = \sum_{s=1}^{9} \gamma_s\, p_s(x,y) \qquad (6.4)$$

satisfies

$$R(x^G, y^G) = p^2(x^G, y^G) \qquad (6.5)$$

where $p^2$ is defined before in Sect. 5.3 using the least squares procedure, for arbitrary choices of cell averages

$$\{\bar{u}_0, \bar{u}_i, \bar{u}_j, \bar{u}_k, \bar{u}_{ia}, \bar{u}_{ib}, \bar{u}_{ja}, \bar{u}_{jb}, \bar{u}_{ka}, \bar{u}_{kb}\}. \qquad (6.6)$$

Since both the left side and the right side of the equality (6.5) are linear in the cell averages (6.6), for the equality to hold for arbitrary $\bar{u}$'s in (6.6) one must have all 10 coefficients of the $\bar{u}$'s to be identically zero (when all terms are moved to one side of the equality), which leads to 10 linear equations for the nine weights $\gamma_s$. This looks like an over-determined system, but is in fact under-determined of rank 8, allowing for one degree of freedom in the choice of the nine $\gamma_s$.

Before explaining this, we first look at a simpler but illustrative one dimensional example. Let us denote $I_j$, $j = 0, 1, 2$, as three equal sized consecutive intervals. The two linear polynomials $p_s$, where $p_1$ agrees with $u$ on cell averages in the intervals $I_0$ and $I_1$, and $p_2$ agrees with $u$ on cell averages in the intervals $I_1$ and $I_2$, give the following two second order approximations to the value of $u$ at the point $x_{3/2}$ (the boundary of $I_1$ and $I_2$):

$$-\frac{1}{2}\bar{u}_0 + \frac{3}{2}\bar{u}_1, \qquad \frac{1}{2}\bar{u}_1 + \frac{1}{2}\bar{u}_2. \qquad (6.7)$$

The quadratic polynomial $p^2$, which agrees with $u$ on cell averages in the intervals $I_0$, $I_1$ and $I_2$, gives the following third order approximation to the value of $u$ at the point $x_{3/2}$:

$$-\frac{1}{6}\bar{u}_0 + \frac{5}{6}\bar{u}_1 + \frac{1}{3}\bar{u}_2. \qquad (6.8)$$

We would like to find $\gamma_s$ such that

$$\gamma_1\left(-\frac{1}{2}\bar{u}_0 + \frac{3}{2}\bar{u}_1\right) + \gamma_2\left(\frac{1}{2}\bar{u}_1 + \frac{1}{2}\bar{u}_2\right) = -\frac{1}{6}\bar{u}_0 + \frac{5}{6}\bar{u}_1 + \frac{1}{3}\bar{u}_2 \qquad (6.9)$$

for arbitrary $\bar{u}$'s. This leads to the following three equations:

$$-\frac{1}{2}\gamma_1 = -\frac{1}{6}, \qquad \frac{3}{2}\gamma_1 + \frac{1}{2}\gamma_2 = \frac{5}{6}, \qquad \frac{1}{2}\gamma_2 = \frac{1}{3},$$

for the two unknowns $\gamma_1$ and $\gamma_2$. It looks like an over-determined system but is in fact rank 2 and has a unique solution

$$\gamma_1 = \frac{1}{3}, \qquad \gamma_2 = \frac{2}{3}.$$

The reason can be understood if we ask for the validity of the equality (6.9) in the cases of $u = 1$, $u = x$ and $u = x^2$. Clearly if (6.9) holds in these three cases then it holds for arbitrary choices of $\bar{u}$'s. The crucial observation is that (6.9) holds for both $u = 1$ and $u = x$ as long as $\gamma_1 + \gamma_2 = 1$, as all three expressions in (6.7) and (6.8) reproduce linear functions exactly. Hence the equality (6.9) is valid for all the three cases $u = 1$, $u = x$ and $u = x^2$ with only two conditions: $\gamma_1 + \gamma_2 = 1$ and another one obtained when $u = x^2$, resulting in a solvable $2 \times 2$ system for $\gamma_s$.
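The one dimensional example can be verified numerically. The Python sketch below writes the three equations for $\gamma_1$ and $\gamma_2$ as a $3 \times 2$ system and solves it with `numpy.linalg.lstsq`; the variable names are chosen here for illustration.

```python
import numpy as np

# Rows: coefficients of the cell averages on I_0, I_1, I_2.
# Columns: the unknown linear weights gamma_1 and gamma_2.
A = np.array([[-0.5, 0.0],
              [ 1.5, 0.5],
              [ 0.0, 0.5]])
b = np.array([-1.0 / 6.0, 5.0 / 6.0, 1.0 / 3.0])

gamma, _, rank, _ = np.linalg.lstsq(A, b, rcond=None)
residual = A @ gamma - b   # vanishes: the three equations are consistent
```

The solver reports rank 2 and a vanishing residual, so the apparently over-determined system is consistent, in agreement with the argument based on $u = 1$, $u = x$ and $u = x^2$.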

The same argument can be applied in the current two dimensional case. Although there are 10 linear equations for the nine weights $\gamma_s$ resulting from the equality (6.5), we should notice that the equality (6.5) is valid for all three cases $u = 1$, $u = x$ and $u = y$ under only one constraint on $\gamma_s$, namely $\sum_{s=1}^{9} \gamma_s = 1$, again because $p_s(x,y)$ and $p^2(x,y)$ all reproduce linear functions exactly. Thus we can eliminate two equations from the ten, resulting in a rank 8 system with one degree of freedom in the solution for $\gamma_s$. In practice, we obtain the solution $\gamma_s$ for $s \ge 2$ with $\gamma_1$ as the degree of freedom.

Note that there are situations when $ia, ib, ja, jb, ka, kb$ might not be distinct. In these cases, we simply discard some of the $p_s$, or just set the corresponding coefficient to zero. For example, if $ib = ja$, we will just use $p_1, p_2, p_3, p_4, p_5, p_7, p_8, p_9$ and discard $p_6$. In this case there is one fewer coefficient but also one fewer condition to satisfy for (6.5), as there is one fewer triangle in the stencil. The discussion carried out above still applies.

The first effort we would like to make is to use this degree of freedom to obtain a set of non-negative $\gamma_s$, which is important for the WENO procedure. Unfortunately, it turns out that, for many triangulations, this is impossible. Some grouping is needed and is discussed next. We want to group these 9 linear polynomials into 3 groups:

$$\sum_{s=1}^{9} \gamma_s\, p_s(x,y) = \sum_{s=1}^{3} \tilde{\gamma}_s\, \tilde{p}_s(x,y),$$

each $\tilde{p}_s(x,y)$ being still a linear polynomial and a second order approximation to $u$, with positive coefficients $\tilde{\gamma}_s \ge 0$. We also require the stencils corresponding to the three new linear polynomials $\tilde{p}_s(x,y)$ to be reasonably separated, so that when shocks are present, not all stencils will contain the shock under normal situations.

The grouping we will introduce in the following works for most triangulations. There are however cases when it does give some negative coefficients, especially when one is doing adaptive meshing and is near the adaptively refined regions where triangle sizes are changing very abruptly. In such cases one would need to use a Lax-Friedrichs like procedure, namely breaking each coefficient $\tilde{\gamma}_s = 2\tilde{\gamma}_s - \tilde{\gamma}_s$ and collecting the three positive terms and the three negative terms separately to obtain WENO weights. This procedure is currently being developed by Hu and Shu and has been performing well numerically in our preliminary tests. It will appear in a future publication. In the following we will only consider those triangulations for which our grouping strategy produces positive weights.

For the first quadrature point on side $i$ ($G_1$ in Fig. 5.1), Group 1 contains
$p_2$ $(0,k,i)$, $p_4$ $(0,i,ia)$, and $p_5$ $(0,i,ib)$,

$$\tilde p_1 = (\gamma_2 p_2 + \gamma_4 p_4 + \gamma_5 p_5)/(\gamma_2 + \gamma_4 + \gamma_5), \qquad \tilde\gamma_1 = \gamma_2 + \gamma_4 + \gamma_5,$$

Group 2 contains $p_3$ $(0,i,j)$, $p_6$ $(0,j,ja)$, and $p_7$ $(0,j,jb)$,

$$\tilde p_2 = (\gamma_3 p_3 + \gamma_6 p_6 + \gamma_7 p_7)/(\gamma_3 + \gamma_6 + \gamma_7), \qquad \tilde\gamma_2 = \gamma_3 + \gamma_6 + \gamma_7,$$

Group 3 contains $p_1$ $(0,j,k)$, $p_8$ $(0,k,ka)$, and $p_9$ $(0,k,kb)$,

$$\tilde p_3 = (\gamma_1 p_1 + \gamma_8 p_8 + \gamma_9 p_9)/(\gamma_1 + \gamma_8 + \gamma_9), \qquad \tilde\gamma_3 = \gamma_1 + \gamma_8 + \gamma_9.$$

The resulting linear polynomial

$$R(x,y) = \sum_{s=1}^{3} \tilde\gamma_s \tilde p_s(x,y) \qquad (6.10)$$

is identical to $R(x,y)$ in (6.4) and in most cases the coefficients $\tilde\gamma_s$ can
be made non-negative by suitably choosing the value of the degree of freedom
$\gamma_1$, through the solution of a group of 3 linear inequalities for $\gamma_1$.

We remark that for practical implementation, it is the 5 constants $a_l$,
which depend on the local geometry only, such that

$$\tilde p_1(x^{G_1}, y^{G_1}) = a_1 \bar u_0 + a_2 \bar u_i + a_3 \bar u_k + a_4 \bar u_{ia} + a_5 \bar u_{ib}, \qquad (6.11)$$

that have to be precomputed and stored once the mesh is generated. We do
not need to store any information about the polynomial $\tilde p_1$ itself.

For the second quadrature point on side $i$ ($G_2$ in Fig. 5.1), Group 1
contains $p_3$ $(0,i,j)$, $p_4$ $(0,i,ia)$, and $p_5$ $(0,i,ib)$, with the combination
coefficient $\tilde\gamma_1 = \gamma_3 + \gamma_4 + \gamma_5$; Group 2 contains $p_2$ $(0,k,i)$,
$p_8$ $(0,k,ka)$, and $p_9$ $(0,k,kb)$, with the combination coefficient
$\tilde\gamma_2 = \gamma_2 + \gamma_8 + \gamma_9$; Group 3 contains $p_1$ $(0,j,k)$, $p_6$ $(0,j,ja)$, and
$p_7$ $(0,j,jb)$, with the combination coefficient $\tilde\gamma_3 = \gamma_1 + \gamma_6 + \gamma_7$.
We can do the same thing for the other two sides ($j$, $k$).
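The grouping for the first quadrature point on side $i$ can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `group_linear_weights`, the 0-based index triples in `GROUPS_G1`, and the random test data are hypothetical and not part of the original scheme description. The sketch checks that the grouped combination reproduces the ungrouped one.

```python
import numpy as np

def group_linear_weights(gamma, p_vals, groups):
    """Collapse nine linear weights and polynomial values into three groups.

    gamma  : nine linear weights gamma_1..gamma_9 (stored at indices 0..8)
    p_vals : nine values p_s(x^G, y^G) at one quadrature point
    groups : three index triples selecting the members of each group
    Assumes every group sum is nonzero (otherwise the division fails).
    """
    gamma = np.asarray(gamma, dtype=float)
    p_vals = np.asarray(p_vals, dtype=float)
    g_tilde = np.array([gamma[list(g)].sum() for g in groups])
    p_tilde = np.array([
        np.dot(gamma[list(g)], p_vals[list(g)]) / gamma[list(g)].sum()
        for g in groups
    ])
    return g_tilde, p_tilde

# 0-based triples for G1 on side i: {p2,p4,p5}, {p3,p6,p7}, {p1,p8,p9}
GROUPS_G1 = [(1, 3, 4), (2, 5, 6), (0, 7, 8)]

# Hypothetical data, only to exercise the function.
rng = np.random.default_rng(0)
gamma = rng.uniform(0.05, 0.2, size=9)
gamma /= gamma.sum()
p_vals = rng.normal(size=9)

g_tilde, p_tilde = group_linear_weights(gamma, p_vals, GROUPS_G1)
```

Because each grouped polynomial is a weighted average of its members, `np.dot(g_tilde, p_tilde)` equals `np.dot(gamma, p_vals)` up to rounding, which is the identity behind (6.10).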

Next we describe the fourth order reconstruction. Again, the key step
to build a high order WENO scheme based on lower order polynomials is
carried out in the following. We would like to construct several quadratic
polynomials whose weighted average will give the same result as the cubic
reconstruction $p^3$, which was described in Sect. 6.2, at each quadrature point
(the weights are different for different quadrature points). The following 6
quadratic polynomials are constructed by having the same cell averages as $u$
on the corresponding triangles:

$q_1$ (on triangles: $0,i,ia,ib,k,kb$), $q_2$ (on triangles: $0,i,ia,ib,j,ja$),
$q_3$ (on triangles: $0,j,ja,jb,i,ib$), $q_4$ (on triangles: $0,j,ja,jb,k,ka$),
$q_5$ (on triangles: $0,k,ka,kb,j,jb$), $q_6$ (on triangles: $0,k,ka,kb,i,ia$).

For each quadrature point $(x^G, y^G)$, we would like to find the linear
weights $\gamma_s$ such that the linear combination of these $q_s$

$$Q(x,y) = \sum_{s=1}^{6} \gamma_s q_s(x,y) \qquad (6.12)$$

satisfies

$$Q(x^G, y^G) = p^3(x^G, y^G) \qquad (6.13)$$

for all $\bar u$'s.

As before, (6.13) results in 10 linear equations for the 6 unknowns $\gamma_s$,
which are the coefficients of the 10 cell averages $\bar u$'s in (6.6). This looks
like a grossly over-determined system, but it is in fact under-determined with
rank 5, thus allowing a solution for $\gamma_s$ with one degree of freedom. A crucial
observation is again that (6.13) is valid for all the 6 cases
$u = 1, x, y, x^2, xy, y^2$ under just one constraint on the $\gamma_s$, namely
$\sum_{s=1}^{6} \gamma_s = 1$, because $q_s(x,y)$ and $p^3(x,y)$ all reproduce quadratic
functions exactly. We can thus eliminate 5 equations from the 10, resulting
in a rank 5 system with one degree of freedom in the solution for $\gamma_s$. In
practice, we obtain the solution $\gamma_s$ for $s \ge 2$ with $\gamma_1$ as the degree of
freedom.

Again, the first effort we would like to make is to use this degree of freedom
to obtain a set of non-negative $\gamma_s$, through the solution of a group of 5
linear inequalities for $\gamma_1$. This is important for the WENO procedure.
Positivity seems achievable for the mostly near-uniform meshes used in the
numerical examples. For general triangulations negative coefficients do
appear, and the investigation of using the Lax-Friedrichs like procedure
mentioned above for the third order case is currently under way.
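Since every $\gamma_s$ with $s \ge 2$ depends affinely on the free parameter $\gamma_1$, the non-negativity requirement reduces to intersecting half-lines. The following is a minimal sketch under that assumption; the representation $\gamma_s = a_s + b_s \gamma_1$, the function name `nonneg_interval`, and the sample coefficients are hypothetical and only illustrate the idea.

```python
import math

def nonneg_interval(a, b, tol=1e-14):
    """Return the interval of gamma_1 for which all a_s + b_s * gamma_1 >= 0.

    a, b : sequences of affine coefficients (one pair per inequality)
    Returns (lo, hi), possibly with infinite ends, or None if no value works.
    """
    lo, hi = -math.inf, math.inf
    for a_s, b_s in zip(a, b):
        if b_s > tol:            # a_s + b_s*g >= 0  <=>  g >= -a_s/b_s
            lo = max(lo, -a_s / b_s)
        elif b_s < -tol:         # a_s + b_s*g >= 0  <=>  g <= -a_s/b_s
            hi = min(hi, -a_s / b_s)
        elif a_s < 0:            # constant and negative: infeasible
            return None
    return (lo, hi) if lo <= hi else None

# Hypothetical coefficients for three of the inequalities.
interval = nonneg_interval([0.5, 0.2, -0.1], [-1.0, 1.0, 2.0])
infeasible = nonneg_interval([-1.0, -1.0], [1.0, -1.0])
```

When the function returns `None`, no choice of $\gamma_1$ makes all weights non-negative, which corresponds to the triangulations for which negative coefficients appear.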

We finally come to the point of smooth indicators and nonlinear weights.
For this we follow Jiang and Shu [55] exactly, see Sect. 3.2. For a
polynomial $p(x,y)$ with degree up to $n$, we define the following measurement
for smoothness

$$S = \sum_{1 \le |\alpha| \le n} \int_{\Delta} |\Delta|^{|\alpha|-1} \left( D^\alpha p(x,y) \right)^2 dx\,dy \qquad (6.14)$$

where $\alpha$ is a multi-index and $D$ is the derivative operator, for example, when
$\alpha = (1,2)$ then $|\alpha| = 3$ and $D^\alpha p(x,y) = \partial^3 p(x,y) / (\partial x\, \partial y^2)$. The non-linear
weights are then defined as:

$$\omega_j = \frac{\tilde\omega_j}{\sum_i \tilde\omega_i}, \qquad \tilde\omega_i = \frac{\gamma_i}{(\epsilon + S_i)^2} \qquad (6.15)$$

where $\gamma_i$ is the $i$-th coefficient in the linear combination of polynomials
(i.e. the $\tilde\gamma_s$ in (6.10) for the third order case and the $\gamma_s$ in (6.12) for
the fourth order case), $S_i$ is the measurement of smoothness of the $i$-th
polynomial $p_i(x,y)$ (i.e. the $\tilde p_s$ in (6.10) for the third order case and
the $q_s$ in (6.12) for the fourth order case), and $\epsilon$ is a small positive
number which we take as $\epsilon = 10^{-3}$ for all the numerical experiments for
triangles. The numerical results are not very sensitive to the choice of $\epsilon$
in a range from $10^{-2}$ to $10^{-6}$. In general, larger $\epsilon$ gives better accuracy
for smooth problems but may generate small oscillations for shocks. Smaller
$\epsilon$ is more friendly to shocks. The nonlinear weights $\omega_j$ in (6.15) would then
replace the linear weights $\gamma_j$ to form a WENO reconstruction.
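The weight formula (6.15) is short enough to sketch directly. The function name `weno_weights` and the sample inputs are hypothetical; the default `eps=1e-3` follows the value quoted in the text for the triangular-mesh experiments.

```python
import numpy as np

def weno_weights(gamma, S, eps=1e-3):
    """Nonlinear weights as in (6.15): w_i ~ gamma_i / (eps + S_i)^2, normalized."""
    gamma = np.asarray(gamma, dtype=float)
    S = np.asarray(S, dtype=float)
    w_tilde = gamma / (eps + S) ** 2
    return w_tilde / w_tilde.sum()

# Hypothetical linear weights for three candidate polynomials.
gamma = np.array([0.2, 0.3, 0.5])

# Equal smoothness on all stencils: the linear weights are recovered.
w_smooth = weno_weights(gamma, [0.1, 0.1, 0.1])

# Third stencil crosses a discontinuity (large S): its weight is nearly zero.
w_shock = weno_weights(gamma, [1e-4, 1e-4, 10.0])
```

The two calls show the intended behaviour: in smooth regions the nonlinear weights stay close to the linear ones, so the high order combination is kept, while a stencil with a large smoothness measurement is effectively switched off.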

We emphasize that the smoothness measurements (6.14) are quadratic
functions of the cell averages in the stencil. For example, it is the 10
constants $b_l$ and $c_l$, which depend on the local geometry only, such that

$$S = (b_1 \bar u_0 + b_2 \bar u_i + b_3 \bar u_k + b_4 \bar u_{ia} + b_5 \bar u_{ib})^2 + (c_1 \bar u_0 + c_2 \bar u_i + c_3 \bar u_k + c_4 \bar u_{ia} + c_5 \bar u_{ib})^2 \qquad (6.16)$$

for the smoothness measurements (6.14) of $\tilde p_1$ in (6.10), that have to be
precomputed and stored once the mesh is generated. We do not need to store
any information about the polynomial $\tilde p_1$ itself.
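To see why exactly two squares appear in the third order case, one can evaluate (6.14) for a generic linear polynomial. The coefficient names $d_0, d_1, d_2$ and the weights $\mu_l, \nu_l$ below are notation introduced only for this worked sketch; they do not appear in the original text.

```latex
\begin{aligned}
p(x,y) &= d_0 + d_1 x + d_2 y, \qquad n = 1 \;\Rightarrow\; |\alpha| = 1, \quad |\Delta|^{|\alpha|-1} = 1, \\
S &= \int_{\Delta} \left( p_x^2 + p_y^2 \right) dx\,dy = |\Delta| \left( d_1^2 + d_2^2 \right), \\
d_1 &= \sum_{l} \mu_l \bar u_l, \quad d_2 = \sum_{l} \nu_l \bar u_l
\;\Rightarrow\;
S = \Big( \sum_{l} b_l \bar u_l \Big)^2 + \Big( \sum_{l} c_l \bar u_l \Big)^2,
\quad b_l = \sqrt{|\Delta|}\,\mu_l, \;\; c_l = \sqrt{|\Delta|}\,\nu_l .
\end{aligned}
```

The sums run over the five cell averages in the stencil. The gradient of a linear reconstruction depends linearly on those averages, with weights fixed by the geometry, which is why only the constants need to be stored.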
7  ENO  and  WENO  Schemes  in  Multi  Dimensions

In  this  section  we  describe  the  ENO  and  WENO  schemes  for  2D  conservation 
laws: 

$$u_t(x,y,t) + f_x(u(x,y,t)) + g_y(u(x,y,t)) = 0 \qquad (7.1)$$

again  equipped  with  suitable  initial  and  boundary  conditions. 

Although  we  present  everything  in  2D,  most  of  the  discussion  is  also  valid 
for  higher  dimensions. 

We  again  concentrate  on  the  discussion  of  spatial  discretizations,  and  will 
leave  the  time  variable  t  continuous  (the  method-of-lines  approach).  Time 
discretization  will  be  discussed  in  Sect.  9. 

For structured meshes, our computational domain is rectangular, given
by (5.1). In such cases our grids will be Cartesian, given by (5.2) and (5.3).
For unstructured meshes, we assume a triangulation consisting of triangles
(5.18).

We do not discuss boundary conditions in this section. We thus assume
that the values of the numerical solution are also available outside the
computational domain whenever they are needed. This would be the case for
periodic or compactly supported problems. Two dimensional boundary condition
treatments are similar to the one dimensional case discussed in Sect. 4.5.


7.1  Finite  Volume  Formulation  in  the  Scalar  Case 

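For finite volume schemes, or schemes based on cell averages, we do not solve
(7.1) directly, but its integrated version. For a structured mesh, we integrate
(7.1) over the cell $I_{ij}$ to obtain

$$\frac{d\bar u_{ij}(t)}{dt} = -\frac{1}{\Delta x_i \Delta y_j} \left( \int_{y_{j-\frac{1}{2}}}^{y_{j+\frac{1}{2}}} f(u(x_{i+\frac{1}{2}},y,t))\,dy - \int_{y_{j-\frac{1}{2}}}^{y_{j+\frac{1}{2}}} f(u(x_{i-\frac{1}{2}},y,t))\,dy + \int_{x_{i-\frac{1}{2}}}^{x_{i+\frac{1}{2}}} g(u(x,y_{j+\frac{1}{2}},t))\,dx - \int_{x_{i-\frac{1}{2}}}^{x_{i+\frac{1}{2}}} g(u(x,y_{j-\frac{1}{2}},t))\,dx \right) \qquad (7.2)$$

where

$$\bar u_{ij}(t) = \frac{1}{\Delta x_i \Delta y_j} \int_{y_{j-\frac{1}{2}}}^{y_{j+\frac{1}{2}}} \int_{x_{i-\frac{1}{2}}}^{x_{i+\frac{1}{2}}} u(\xi,\eta,t)\,d\xi\,d\eta \qquad (7.3)$$

is the cell average. We approximate (7.2) by the following conservative scheme

$$\frac{d\bar u_{ij}(t)}{dt} = -\frac{1}{\Delta x_i} \left( \hat f_{i+\frac{1}{2},j} - \hat f_{i-\frac{1}{2},j} \right) - \frac{1}{\Delta y_j} \left( \hat g_{i,j+\frac{1}{2}} - \hat g_{i,j-\frac{1}{2}} \right) \qquad (7.4)$$

where the numerical flux $\hat f_{i+\frac{1}{2},j}$ is defined by

$$\hat f_{i+\frac{1}{2},j} = \sum_{\alpha} w_\alpha\, h\left( u^-_{i+\frac{1}{2},\,y_j+\beta_\alpha \Delta y_j},\; u^+_{i+\frac{1}{2},\,y_j+\beta_\alpha \Delta y_j} \right) \qquad (7.5)$$

where $\beta_\alpha$ and $w_\alpha$ are Gaussian quadrature nodes and weights, for
approximating the integration in $y$:

$$\frac{1}{\Delta y_j} \int_{y_{j-\frac{1}{2}}}^{y_{j+\frac{1}{2}}} f(u(x_{i+\frac{1}{2}},y,t))\,dy$$

inside the integral form of the PDE (7.2), and $u^\pm_{i+\frac{1}{2},\,y_j+\beta_\alpha \Delta y_j}$ are the
$k$-th order accurate reconstructed values obtained by ENO or WENO
reconstruction described in the previous section. As before, the superscripts
$\pm$ imply the values are obtained within the cell $I_{ij}$ (for the superscript $-$)
and the cell $I_{i+1,j}$ (for the superscript $+$), respectively. The flux
$\hat g_{i,j+\frac{1}{2}}$ is defined similarly by

$$\hat g_{i,j+\frac{1}{2}} = \sum_{\alpha} w_\alpha\, h\left( u^-_{x_i+\beta_\alpha \Delta x_i,\,j+\frac{1}{2}},\; u^+_{x_i+\beta_\alpha \Delta x_i,\,j+\frac{1}{2}} \right) \qquad (7.6)$$

for approximating the integration in $x$:

$$\frac{1}{\Delta x_i} \int_{x_{i-\frac{1}{2}}}^{x_{i+\frac{1}{2}}} g(u(x,y_{j+\frac{1}{2}},t))\,dx$$

inside the integral form of the PDE (7.2). $u^\pm_{x_i+\beta_\alpha \Delta x_i,\,j+\frac{1}{2}}$ are again the
$k$-th order accurate reconstructed values obtained by ENO or WENO
reconstruction described in the previous section. $h$ is again a one
dimensional monotone flux, examples being given in (4.6)-(4.8).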

We summarize the procedure to build a finite volume ENO or WENO
2D scheme (7.4) on a structured mesh, given the cell averages $\{\bar u_{ij}\}$ (we
again drop the explicit reference to the time variable $t$), and a one
dimensional monotone flux $h$, as follows:


Algorithm 7.1. Finite volume 2D scalar ENO and WENO schemes
for a rectangular mesh.

1. Follow the procedures described in Sect. 6.1, to obtain ENO or WENO
reconstructed values at the Gaussian points,

$u^\pm_{i+\frac{1}{2},\,y_j+\beta_\alpha \Delta y_j}$ and $u^\pm_{x_i+\beta_\alpha \Delta x_i,\,j+\frac{1}{2}}$.

Notice that this step involves two one dimensional reconstructions, each
one to remove a one dimensional cell average in one of the two directions.
Also notice that the optimal weights used in the WENO reconstruction
procedure are different for different Gaussian points indexed by $\alpha$;

2. Compute the fluxes $\hat f_{i+\frac{1}{2},j}$ and $\hat g_{i,j+\frac{1}{2}}$ using (7.5) and (7.6);

3. Form the scheme (7.4).

□
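Step 2 of the algorithm, the quadrature-based edge flux (7.5), can be sketched as below. Several things are assumptions made for illustration: the two-point Gauss rule with nodes expressed on the reference interval $[-\frac{1}{2}, \frac{1}{2}]$, the Lax-Friedrichs form chosen for the monotone flux $h$, and the names `BETA`, `W`, `lax_friedrichs` and `edge_flux`. The reconstructed values $u^\pm$ at the Gauss points are taken as given inputs.

```python
import numpy as np

# Two-point Gauss rule on [-1/2, 1/2]; the weights sum to 1 so that the
# result approximates the *average* of the flux along the edge.
BETA = np.array([-np.sqrt(3.0) / 6.0, np.sqrt(3.0) / 6.0])
W = np.array([0.5, 0.5])

def lax_friedrichs(f, a, b, alpha):
    """Lax-Friedrichs monotone flux h(a, b); alpha bounds |f'(u)|."""
    return 0.5 * (f(a) + f(b) - alpha * (b - a))

def edge_flux(f, u_minus, u_plus, alpha):
    """Quadrature of h(u^-, u^+) over the Gauss points of one cell edge.

    u_minus, u_plus : reconstructed values at the Gauss points
                      y_j + BETA[a] * dy, from the left and right cells.
    """
    return sum(w * lax_friedrichs(f, um, up, alpha)
               for w, um, up in zip(W, u_minus, u_plus))

# Hypothetical usage: linear flux f(u) = u with continuous data on the edge.
flux = edge_flux(lambda u: u, [2.0, 4.0], [2.0, 4.0], alpha=1.0)
```

With continuous data the dissipation term vanishes and the result is the weighted average of $f$ at the two Gauss points. The same routine applied to $g$ with the values at $x_i + \beta_\alpha \Delta x_i$ gives (7.6).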


We remark that the finite volume scheme in 2D, as described above, is
very expensive due to the following reasons:

- A two dimensional reconstruction, at the cost of two one dimensional
reconstructions per grid point, is needed. For general $n$ space dimensions,
the cost becomes $n$ one dimensional reconstructions per grid point;

- More than one quadrature point is needed in formulating the flux (7.5)-
(7.6), for order of accuracy higher than two. Thus, for ENO, although the
stencil choosing process needs to be done only once, the reconstruction
(2.10) has to be done for each quadrature point used in the flux
formulation. For WENO, the optimal weights are also different for each
quadrature point. This becomes much more costly for $n > 2$ dimensions,
as then the fluxes are defined by integrals in $n - 1$ dimensions and an
$n - 1$ dimensional quadrature rule must be used.


This is why multidimensional finite volume schemes of order of accuracy
higher than 2 are rarely used for structured meshes. For 2D, based on
[43], Casper [14] has coded up a fourth order finite volume ENO scheme for
Cartesian grids, see also [15]. A 3D finite volume ENO code of order of
accuracy higher than 2 for a rectangular mesh does not exist yet, to the
author's knowledge. A finite difference version to be described in Sect. 7.2
is much more economical for a multidimensional structured mesh.

At  the  second  order  level,  the  cost  is  greatly  reduced  because: 

-  There is no need to perform a reconstruction, as the cell average $\bar u_{ij}$
agrees with the point value at the center $u(x_i,y_j)$ to second order $O(\Delta^2)$;

-  The  quadrature  rule  in  defining  the  flux  (7.5)-(7.6)  needs  only  one  (mid) 
point. 

One advantage of finite volume ENO or WENO schemes is that they can
be defined on arbitrary meshes, provided that an ENO or WENO reconstruction
on that mesh is available. This is described below. See also [1].

Taking the triangle $\Delta_i$ as our control volume, we formulate the
semi-discrete finite volume scheme for equation (7.1) as:

$$\frac{d}{dt} \bar u_i(t) + \frac{1}{|\Delta_i|} \int_{\partial\Delta_i} F \cdot n\,ds = 0 \qquad (7.7)$$

where $\bar u_i(t)$ is the cell average of $u$ on the cell $\Delta_i$, $F = (f,g)^T$, and $n$ is
the outward unit normal of the triangle boundary $\partial\Delta_i$.

The line integral in (7.7) is discretized by a $q$-point Gaussian integration
formula,

$$\int_{\Gamma_k} F \cdot n\,ds \approx |\Gamma_k| \sum_{j=1}^{q} \omega_j\, F(u(G_j,t)) \cdot n \qquad (7.8)$$

where $\Gamma_k$ is a side of the triangle with length $|\Gamma_k|$, and
$F(u(G_j,t)) \cdot n$ is replaced by a one dimensional numerical flux in the $n$
direction. We can for example use any one of (4.6)-(4.8). The simple
Lax-Friedrichs flux is for example given by

$$F(u(G_j,t)) \cdot n \approx \frac{1}{2} \left[ \left( F(u^-(G_j,t)) + F(u^+(G_j,t)) \right) \cdot n - \alpha \left( u^+(G_j,t) - u^-(G_j,t) \right) \right] \qquad (7.9)$$

where $\alpha$ is taken as an upper bound for $|F'(u) \cdot n|$. Here, $u^-$ and $u^+$ are
the reconstructed values of $u$ inside the triangle and outside the triangle
(inside the neighboring triangle) at the Gaussian point, see Sect. 6.2.

Since we are constructing schemes up to fourth order accuracy, the two point
Gaussian rule $q = 2$ is used, which has $G_1 = cP_1 + (1-c)P_2$,
$G_2 = cP_2 + (1-c)P_1$, $c = \frac{1}{2} + \frac{\sqrt{3}}{6}$ and $\omega_1 = \omega_2 = \frac{1}{2}$ for the line
with end points $P_1$ and $P_2$.

We  now  give  some  test  results  about  accuracy  for  the  third  and  fourth 
order  WENO  schemes  constructed  on  triangulations  above. 

The first example is the two-dimensional linear equation:

$$u_t + u_x + u_y = 0 \qquad (7.10)$$

with the initial condition $u_0(x,y) = \sin\left(\frac{\pi}{2}(x+y)\right)$, $-2 \le x \le 2$, $-2 \le y \le 2$,
and periodic boundary conditions.

We  first  use  uniform  triangular  meshes  which  are  obtained  by  adding  one 
diagonal  line  in  each  rectangle,  shown  in  Fig.  7.1  for  the  coarsest  case  h  =  |. 
The  accuracy  results  are  shown  for  both  the  third  order  scheme  (from  the 
combination  of  linear  polynomials)  and  the  fourth  order  scheme  (from  the 
combination  of  quadratic  polynomials),  for  both  the  linear  constant  weights 
in  Table  7.1  and  the  WENO  weights  in  Table  7.2.  Here  h  is  the  length  of  the 
rectangles.  The  results  shown  are  at  t  =  2.0.  The  errors  presented  are  those 
of  the  cell  averages  of  u. 
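The "order" columns in the accuracy tables are the usual observed convergence rates between consecutive meshes. A small sketch of that computation follows; the function name `observed_orders` is hypothetical, and the sample errors are the first four third order $L^1$ errors reported in Table 7.1.

```python
import math

def observed_orders(errors, ratio=2.0):
    """Observed order between consecutive meshes refined by `ratio`.

    order_k = log(e_k / e_{k+1}) / log(ratio)
    """
    return [math.log(e0 / e1) / math.log(ratio)
            for e0, e1 in zip(errors, errors[1:])]

# First four third order L1 errors from Table 7.1 (mesh size halved each time).
l1_errors = [1.80e-01, 2.81e-02, 3.65e-03, 4.60e-04]
orders = observed_orders(l1_errors)
```

Because the tabulated errors are rounded to three significant digits, orders recomputed this way can differ from the printed ones in the last digit.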


Fig.  7.1.  Uniform  mesh  with  h  =  |  for  the  accuracy  test.

Table 7.1. Accuracy for the 2D linear equation, uniform meshes, linear schemes.

        |           P1 (3rd order)            |           P2 (4th order)
   h    | L1 error | order | L∞ error | order | L1 error | order | L∞ error | order
--------+----------+-------+----------+-------+----------+-------+----------+-------
        | 1.80E-01 |   -   | 2.79E-01 |   -   | 1.40E-02 |   -   | 2.17E-02 |   -
        | 2.81E-02 | 2.68  | 4.37E-02 | 2.68  | 9.11E-04 | 3.94  |          |
        | 3.65E-03 | 2.95  | 5.72E-03 | 2.93  | 5.57E-05 | 4.03  | 8.72E-05 | 4.02
        | 4.60E-04 | 2.99  | 7.22E-04 | 2.99  | 3.43E-06 | 4.02  | 5.39E-06 | 4.02
        | 5.76E-05 |       | 9.05E-05 |       | 2.12E-07 | 4.02  | 3.34E-07 | 4.01
        | 7.21E-06 |       | 1.13E-05 |       | 1.32E-08 | 4.01  | 2.07E-08 | 4.01

Table 7.2. Accuracy for the 2D linear equation, uniform meshes, WENO schemes.

        |           P1 (3rd order)            |           P2 (4th order)
   h    | L1 error | order | L∞ error | order | L1 error | order | L∞ error | order
--------+----------+-------+----------+-------+----------+-------+----------+-------
        | 2.66E-01 |   -   | 4.30E-01 |   -   | 1.38E-02 |   -   | 2.94E-02 |   -
        | 8.11E-02 | 1.71  | 1.93E-01 | 1.16  |          | 2.94  | 2.74E-03 | 3.42
        | 2.65E-02 | 1.62  | 6.16E-02 | 1.65  | 8.87E-05 | 4.34  | 1.46E-04 | 4.23
        | 2.68E-03 | 3.31  | 8.77E-03 | 2.81  | 4.34E-06 | 4.35  | 7.11E-06 | 4.36
        | 1.44E-04 | 4.22  | 4.88E-04 | 4.17  | 2.30E-07 | 4.24  | 3.71E-07 | 4.26
        | 8.05E-06 | 4.16  | 2.40E-05 | 4.35  | 1.34E-08 | 4.10  | 2.12E-08 | 4.13

We  then  use  non-uniform  meshes,  shown  in  Fig.  7.2  for  the  coarsest  case 
h  —  |,  where  h  is  just  an  average  mesh  size.  The  refinement  of  the  meshes  is 
done  in  a  uniform  way,  namely  by  cutting  each  triangle  into  4  smaller  similar 
ones.  The  accuracy  result  is  shown  in  Table  7.3  for  the  linear  constant  weights 
case  and  in  Table  7.4  for  the  WENO  case. 


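The second example is the two-dimensional Burgers' equation:

$$u_t + \left( \frac{u^2}{2} \right)_x + \left( \frac{u^2}{2} \right)_y = 0 \qquad (7.11)$$

with the initial condition $u_0(x,y) = 0.3 + 0.7 \sin\left(\frac{\pi}{2}(x+y)\right)$,
$-2 \le x \le 2$, $-2 \le y \le 2$, and periodic boundary conditions.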

We first use the same uniform triangular meshes as in the previous example,
shown in Fig. 7.1 for the coarsest case h = §. In Table 7.5, the accuracy
results for the linear schemes are shown for both the third order scheme and

Fig. 7.2. Non-uniform mesh with h = | for accuracy test.

Table 7.3. Accuracy for the 2D linear equation, non-uniform meshes, linear
schemes.

        |           P1 (3rd order)            |           P2 (4th order)
   h    | L1 error | order | L∞ error | order | L1 error | order | L∞ error | order
--------+----------+-------+----------+-------+----------+-------+----------+-------
 h0/2   | 1.21E-01 |   -   | 2.25E-01 |   -   | 4.95E-03 |   -   | 1.73E-02 |   -
 h0/4   | 1.81E-02 | 2.74  | 3.74E-02 | 2.59  | 2.90E-04 | 4.09  | 1.42E-03 | 3.61
 h0/8   | 2.36E-03 | 2.94  | 5.39E-03 |       | 2.21E-05 | 3.71  | 8.32E-05 |
 h0/16  | 3.00E-04 | 2.98  | 7.19E-04 | 2.91  | 1.29E-06 | 4.10  |          |
 h0/32  | 3.78E-05 | 2.99  | 9.40E-05 | 2.94  | 7.76E-08 |       | 3.16E-07 | 4.01
 h0/64  | 4.75E-06 | 2.99  | 1.22E-05 | 2.95  | 4.75E-09 | 4.03  | 1.95E-08 | 4.02

Table 7.4. Accuracy for the 2D linear equation, non-uniform meshes, WENO
schemes.

        |           P1 (3rd order)            |           P2 (4th order)
   h    | L1 error | order | L∞ error | order | L1 error | order | L∞ error | order
--------+----------+-------+----------+-------+----------+-------+----------+-------
 h0/2   | 2.79E-01 |   -   | 5.28E-01 |   -   | 1.77E-02 |   -   | 6.41E-02 |   -
 h0/4   | 8.43E-02 | 1.73  |          | 1.19  | 8.85E-04 | 4.32  | 3.07E-03 | 4.38
 h0/8   | 2.53E-02 | 1.74  | 7.47E-02 | 1.64  |          |       | 1.43E-04 | 4.42
 h0/16  | 2.24E-03 | 3.50  | 1.14E-02 | 2.71  |          |       | 6.37E-06 | 4.49
 h0/32  | 1.18E-04 | 4.25  | 6.83E-04 |       |          |       | 3.36E-07 | 4.25
 h0/64  | 6.21E-06 | 4.25  |          |       | 4.92E-09 | 4.19  | 2.00E-08 | 4.07

the fourth order scheme, at $t = 0.5/\pi^2$ when the solution is still smooth. The
errors presented are those of the point values at the 6 quadrature points of
each triangle. In Table 7.6, the same accuracy results for the WENO schemes
are shown.


Table 7.5. Accuracy for 2D Burgers' equation, uniform meshes, linear schemes.

        |           P1 (3rd order)            |           P2 (4th order)
   h    | L1 error | order | L∞ error | order | L1 error | order | L∞ error | order
--------+----------+-------+----------+-------+----------+-------+----------+-------
        | 2.67E-02 |   -   | 7.75E-02 |   -   | 8.63E-03 |   -   | 2.18E-02 |   -
        | 3.65E-03 | 2.87  | 1.16E-02 | 2.74  | 6.08E-04 |       |          | 3.68
        | 4.60E-04 |       |          | 2.93  | 3.97E-05 | 3.94  |          |
        | 5.75E-05 |       | 1.91E-04 |       | 2.51E-06 | 3.98  | 7.37E-06 | 3.98
        |          | 3.01  | 2.38E-05 |       | 1.57E-07 |       | 4.62E-07 |
        |          |       | 2.97E-06 |       | 9.83E-09 |       | 2.89E-08 |

Table 7.6. Accuracy for 2D Burgers' equation, uniform meshes, WENO schemes.

        |           P1 (3rd order)            |           P2 (4th order)
   h    | L1 error | order | L∞ error | order | L1 error | order | L∞ error | order
--------+----------+-------+----------+-------+----------+-------+----------+-------
        | 2.76E-02 |   -   | 8.18E-02 |   -   | 8.64E-03 |   -   | 2.10E-02 |   -
        | 4.63E-03 | 2.58  | 1.20E-02 |       | 6.05E-04 | 3.84  | 1.73E-03 | 3.60
        | 6.97E-04 | 2.73  |          |       | 3.94E-05 | 3.94  | 1.18E-04 | 3.87
        | 7.12E-05 | 3.29  | 1.90E-04 | 3.51  | 2.50E-06 | 3.98  |          |
        | 7.63E-06 | 3.22  | 2.36E-05 | 3.01  | 1.57E-07 | 3.99  | 4.63E-07 |
        | 9.08E-07 | 3.07  | 2.96E-06 |       | 9.83E-09 |       | 2.89E-08 |

We  then  use  the  same  non-uniform  meshes  as  in  the  previous  example, 
shown  in  Fig.  7.2  for  the  coarsest  case.  The  accuracy  result  is  shown  in  Table 
7.7  for  the  linear  constant  weights  case  and  in  Table  7.8  for  the  WENO  case. 


To demonstrate the application to shock computation, we continue the
WENO calculation to $t = 5/\pi^2$ when discontinuities develop. Fig. 7.3 is
the result for $h = 1/20$ of a uniform mesh. Fig. 7.4 is the result for
$h = h_0/16$ of a non-uniform mesh. We can see that the shock transitions are
sharp and non-oscillatory.




Table 7.7. Accuracy for 2D Burgers' equation, non-uniform meshes, linear schemes.

        |           P1 (3rd order)            |           P2 (4th order)
   h    | L1 error | order | L∞ error | order | L1 error | order | L∞ error | order
--------+----------+-------+----------+-------+----------+-------+----------+-------
 h0/2   | 1.69E-02 |   -   | 7.95E-02 |   -   | 3.96E-03 |   -   | 1.88E-02 |   -
 h0/4   | 2.23E-03 | 2.92  | 1.23E-02 | 2.69  | 2.87E-04 | 3.79  | 2.17E-03 |
 h0/8   | 2.84E-04 | 2.97  | 1.69E-03 | 2.86  | 1.90E-05 |       |          |
 h0/16  | 3.57E-05 | 2.99  | 2.22E-04 | 2.93  | 1.20E-06 | 3.99  | 1.34E-05 | 3.77
 h0/32  | 4.48E-06 | 2.99  |          |       | 7.57E-08 | 3.99  | 1.00E-06 | 3.74
 h0/64  | 5.63E-07 | 2.99  | 4.26E-06 | 2.82  | 4.75E-09 |       | 7.57E-08 | 3.72

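Table 7.8. Accuracy for 2D Burgers' equation, non-uniform meshes, WENO
schemes.

        |           P1 (3rd order)            |           P2 (4th order)
   h    | L1 error | order | L∞ error | order | L1 error | order | L∞ error | order
--------+----------+-------+----------+-------+----------+-------+----------+-------
 h0/2   | 2.01E-02 |   -   | 9.16E-02 |   -   | 4.18E-03 |   -   | 2.37E-02 |   -
 h0/4   | 3.85E-03 | 2.38  | 1.80E-02 | 2.35  | 2.90E-04 |       |          | 3.18
 h0/8   | 5.79E-04 | 2.73  | 3.39E-03 | 2.41  | 1.85E-05 |       |          |
 h0/16  | 5.34E-05 | 3.44  | 3.55E-04 | 3.26  | 1.18E-06 |       |          |
 h0/32  | 5.12E-06 | 3.38  |          |       | 7.45E-08 | 3.99  | 9.99E-07 | 3.76
 h0/64  | 5.82E-07 | 3.14  | 4.26E-06 |       | 4.67E-09 |       | 7.56E-08 | 3.72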

Fig. 7.3. 2D Burgers' equation: $t = 5/\pi^2$, uniform mesh. Left: third order WENO;
Right: fourth order WENO.

Fig. 7.4. 2D Burgers' equation: $t = 5/\pi^2$, non-uniform mesh. Left: third order
WENO; Right: fourth order WENO.


7.2  Finite  Difference  Formulation  in  the  Scalar  Case 


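Here we assume a uniform grid and solve (7.1) directly using a conservative
approximation to the spatial derivative:

$$\frac{du_{ij}(t)}{dt} = -\frac{1}{\Delta x} \left( \hat f_{i+\frac{1}{2},j} - \hat f_{i-\frac{1}{2},j} \right) - \frac{1}{\Delta y} \left( \hat g_{i,j+\frac{1}{2}} - \hat g_{i,j-\frac{1}{2}} \right) \qquad (7.12)$$

where $u_{ij}(t)$ is the numerical approximation to the point value $u(x_i,y_j,t)$.

The numerical flux $\hat f_{i+\frac{1}{2},j}$ is obtained by the one dimensional ENO or
WENO approximation procedure, Algorithm 3.1 or 3.2, with $v(x) = f(u(x,y_j,t))$
and with $j$ fixed. Likewise, the numerical flux $\hat g_{i,j+\frac{1}{2}}$ is obtained by the
one dimensional ENO or WENO approximation procedure, with
$v(y) = g(u(x_i,y,t))$ and with $i$ fixed.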

All  the  one  dimensional  discussions  in  Sect.  4.2,  such  as  upwinding,  ENO- 
Roe,  flux  splitting,  etc.,  can  be  applied  here  dimension  by  dimension. 

The  discussion  here  is  also  valid  for  higher  spatial  dimension  n.  In  effect, 
it  is  the  same  one  dimensional  conservative  derivative  approximation  applied 
to  each  space  dimension. 
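The dimension-by-dimension structure of the finite difference scheme can be sketched as below. To keep the example short, the one dimensional ENO/WENO flux procedure is replaced by a first order upwind stand-in that is valid only when $f'(u) > 0$ and $g'(u) > 0$; this substitution, the function names, and the periodic test grid are assumptions made for illustration.

```python
import numpy as np

def upwind_face_values(v, axis):
    """Stand-in for the 1D ENO/WENO procedure (Algorithm 3.1 or 3.2).

    Returns the numerical flux at the face i+1/2 along `axis`; here simply
    the point value at i (first order upwind for wind from the left).
    """
    return v

def fd_rhs(u, f, g, dx, dy, face_flux=upwind_face_values):
    """Right-hand side of the conservative scheme on a periodic uniform grid."""
    f_hat = face_flux(f(u), axis=0)   # f_hat[i, j] ~ flux at (i+1/2, j)
    g_hat = face_flux(g(u), axis=1)   # g_hat[i, j] ~ flux at (i, j+1/2)
    return (-(f_hat - np.roll(f_hat, 1, axis=0)) / dx
            - (g_hat - np.roll(g_hat, 1, axis=1)) / dy)

# Hypothetical usage on the linear test problem u_t + u_x + u_y = 0.
n = 16
x = np.linspace(-2.0, 2.0, n, endpoint=False)
X, Y = np.meshgrid(x, x, indexing="ij")
u0 = np.sin(0.5 * np.pi * (X + Y))
rhs = fd_rhs(u0, lambda u: u, lambda u: u, dx=4.0 / n, dy=4.0 / n)
```

Swapping `face_flux` for a higher order one dimensional routine leaves `fd_rhs` unchanged, which is the practical appeal of this formulation. Because the scheme is in conservation form, the entries of `rhs` sum to zero on a periodic grid.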

It is a straightforward exercise [16] to show that, in terms of operation
count, the finite difference ENO or WENO schemes cost about a factor of 4
less than the finite volume counterpart of the same order. In 3D this factor
becomes about 9.

We thus strongly recommend the usage of the finite difference version of
ENO and WENO schemes (also called ENO and WENO schemes based on
point values), whenever possible.

7.3  Provable  Properties  in  the  Scalar  Case

Second order ENO schemes are also maximum norm non-increasing for
multi-dimensions. Of course, this stability is too weak to imply any
convergence. As was mentioned before, there is no known convergence result
for ENO schemes of order higher than 2, even for smooth solutions.

WENO schemes have better convergence results also in the current multi-D
case, mainly because their numerical fluxes are smoother. It is proven [55]
that WENO schemes converge for smooth solutions.

We again emphasize that, even though there are very few theoretical
results about ENO or WENO schemes, in practice they are very robust and
stable. We once again caution against any attempts to modify the schemes
solely for the purpose of stability or convergence proofs. In fact the
modification of ENO schemes in [89], presented in Sect. 4.3, which keeps the
formal uniform high order accuracy, actually produces schemes which are
convergent to entropy solutions for general multi dimensional scalar
equations. However it was pointed out there that the modification is not
computationally useful, hence the convergence result has little practical
value.


7.4  Systems 

The advice here is that, when the fluxes are computed along a cell boundary,
a one dimensional local characteristic decomposition normal to the boundary
is performed. Also, the monotone flux is replaced with a one dimensional
exact or approximate Riemann solver. Thus, the discussion in Sect. 4.4 can
be applied here. For second and some third order schemes, a componentwise
ENO or WENO scheme usually gives satisfactory results for most test
problems, with a significantly lower computational cost than the
characteristic decompositions.

There  are  discussions  in  the  literature  about  truly  multi-dimensional  recipes. 
However,  these  tend  to  become  extremely  complicated  for  order  of  accuracy 
higher  than  two,  so  they  have  not  been  used  extensively  in  practice  for  higher 
order  schemes.  Another  reason  to  suggest  against  using  such  complicated 
truly  multidimensional  recipes  for  order  of  accuracy  higher  than  two  is  that, 
while  dimension  by  dimension  schemes  as  advocated  in  these  lecture  notes 
are  not  rotationally  invariant,  the  direction  related  non-symmetry  actually 
diminishes  with  increased  order  [16]. 

8  Further  Topics  in  ENO  and  WENO  Schemes 

In this section we discuss some miscellaneous (but not necessarily
unimportant!) topics in ENO and WENO schemes.

8.1  Subcell  Resolution

This idea was first raised by Harten [44]. The observation is that, since in
interpolating the primitive $V$, two points must be included in the initial
stencil (see Algorithm 3.1), one cannot avoid having at least one cell for each
discontinuity, inside which the reconstructed polynomial is not accurate
($O(1)$ error there). We can clearly see this $O(1)$ error in the ENO
interpolation in Fig. 3.1. The reconstruction in this shocked cell, although
inaccurate, will always be monotone (Property 2 in Sect. 3.1), so stability
will not be a problem. However, it does cause a smearing of the discontinuity
(over one cell, initially).

If we are solving a truly nonlinear shock, then characteristics flow into the
shock, thus any error one makes during time evolution tends to be absorbed
into the shock (we also say that the shock has a self sharpening mechanism).
However, we are less lucky with a linear discontinuity, such as a discontinuity
carried by the linear equation $u_t + u_x = 0$. Such linear discontinuities are
also called contact discontinuities in gas dynamics. The characteristics for
such cases are parallel to the discontinuity, hence any numerical smearing
tends to accumulate and the discontinuity becomes progressively more smeared
with time. Harten argues that the smearing of the discontinuity is at the rate
of $O(\Delta x^{k/(k+1)})$ where $k$ is the order of the scheme. Although higher order
schemes have less smearing, when time is large the smearing is still very
significant.

Harten [44] makes the following simple observation: in the shocked cell
$I_i$, instead of using the reconstruction polynomial $p_i(x)$, which is highly
inaccurate (the only useful information it carries is the cell average in the
cell), one could try to find the location of the discontinuity inside the
cell, say at $x_s$, and then use the neighboring reconstructions $p_{i-1}(x)$
extended to $x_s$ from the left and $p_{i+1}(x)$ extended to $x_s$ from the right. To
find the shock location, one could argue that $p_{i-1}(x)$ is a very accurate
approximation to $v(x)$ up to the discontinuity $x_s$ from the left, and
$p_{i+1}(x)$ is a very accurate approximation to $v(x)$ up to the discontinuity
$x_s$ from the right. We thus extend $p_{i-1}(x)$ from the left into the cell $I_i$,
and extend $p_{i+1}(x)$ from the right into the cell $I_i$, and require that the
cell average $\bar v_i$ be preserved:

$$\int_{x_{i-\frac{1}{2}}}^{x_s} p_{i-1}(x)\,dx + \int_{x_s}^{x_{i+\frac{1}{2}}} p_{i+1}(x)\,dx = \Delta x_i \bar v_i. \qquad (8.1)$$

It can be proven that under very general conditions, (8.1) has only one root
$x_s$ inside the cell $I_i$, hence one could use Newton iterations to find this root.
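The Newton iteration for (8.1) can be sketched as follows. The function name `shock_location`, the starting guess at the cell center, and the step-function test data are assumptions made for illustration; the neighboring reconstructions are passed in as `numpy.polynomial.Polynomial` objects.

```python
from numpy.polynomial import Polynomial

def shock_location(p_left, p_right, x_lo, x_hi, v_bar, tol=1e-12, max_iter=50):
    """Solve (8.1) for x_s by Newton's method.

    p_left, p_right : reconstructions from cells i-1 and i+1 (Polynomial)
    x_lo, x_hi      : cell boundaries x_{i-1/2} and x_{i+1/2}
    v_bar           : cell average to be preserved
    Assumes p_left(x_s) != p_right(x_s) near the root (a genuine jump).
    """
    P_left, P_right = p_left.integ(), p_right.integ()  # antiderivatives
    target = (x_hi - x_lo) * v_bar

    def residual(xs):
        return ((P_left(xs) - P_left(x_lo))
                + (P_right(x_hi) - P_right(xs)) - target)

    xs = 0.5 * (x_lo + x_hi)          # start at the cell center
    for _ in range(max_iter):
        r = residual(xs)
        if abs(r) < tol:
            break
        slope = p_left(xs) - p_right(xs)   # d(residual)/d(x_s)
        xs -= r / slope
    return xs

# Hypothetical test: step from 1 to 0 inside the cell [0, 1], average 0.3.
xs = shock_location(Polynomial([1.0]), Polynomial([0.0]), 0.0, 1.0, 0.3)
```

For this step function the residual is linear in $x_s$, so a single Newton step lands on the exact location 0.3. A production version would also need to guard against iterates leaving the cell.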

Subcell resolution can be applied to both finite volume and finite difference
ENO and WENO schemes [44], [90]. However, it should be applied only to
sharpen contact discontinuities. It is quite dangerous to apply the subcell
resolution to a shock, since it might generate entropy violating expansion
shocks in the numerical solution.

Another very serious restriction of subcell resolution is that it is very
difficult to apply in 2D. However, see Siddiqi, Kimia and Shu [93], where
a geometrical ENO is used to extend the subcell resolution idea to 2D for
image processing problems (we termed it geometric ENO, or GENO).

8.2  Artificial  Compression 

Another  very  useful  idea  to  sharpen  a  contact  discontinuity  is  the  artificial 
compression,  first  developed  by  Harten  [41]  and  further  improved  by  Yang 
[105].  The  idea  is  to  increase  the  magnitude  of  the  slope  of  a  reconstruction,  of 
course  subject  to  certain  monotonicity  restrictions,  near  such  a  discontinuity. 
Notice  that  this  goes  against  the  idea  of  limiting,  which  typically  decreases 
the  magnitude  of  the  slope  of  a  reconstruction. 

Artificial  compression  can  be  applied  both  to  finite  volume  and  to  finite 
difference  ENO  and  WENO  schemes  [105],  [90],  [55].  Unlike  subcell  resolution, 
artificial  compression  can  also  be  applied  easily  to  multi  space  dimensions, 
at  least  in  principle. 

8.3  Other  Building  Blocks 

It is not necessary to stay within polynomial building blocks, although polynomials are the most natural functions to work with. For some applications, other building blocks, such as rational functions, trigonometric polynomials, exponential functions, radial functions, etc., may be more appropriate. The idea of ENO or WENO can be applied also in such situations. The key idea is to find suitable "smooth indicators", similar to the Newton divided differences for the polynomial case, for applying the ENO or WENO idea. See [17] and [52] for some examples.

9  Time  Discretization 

Up to now we have only considered spatial discretizations, leaving the time variable continuous (method of lines). In this section we consider the issue of time discretization. The techniques discussed in this section can also be applied to other types of spatial discretizations using the method of lines approach, such as various TVD and TVB schemes [66,100,85] and discontinuous Galerkin methods [18-21].

9.1  TVD  Runge-Kutta  Methods 

A class of TVD (total variation diminishing) high order Runge-Kutta methods is developed in [89] and further in [36].

These Runge-Kutta methods are used to solve a system of initial value problems of ODEs written as:

$$u_t = L(u), \tag{9.1}$$

resulting from a method of lines spatial approximation to a PDE such as:

$$u_t = -f(u)_x. \tag{9.2}$$

We have written the equation in (9.2) as a 1D conservation law, but the discussion which follows applies to general initial value problems of PDEs in any number of spatial dimensions. Clearly, $L(u)$ in (9.1) is an approximation (e.g. the ENO or WENO approximation in these lecture notes) to the derivative $-f(u)_x$ in the PDE (9.2).

If we assume that a first order Euler forward time stepping:

$$u^{n+1} = u^n + \Delta t\, L(u^n) \tag{9.3}$$

is stable in a certain norm:

$$\|u^{n+1}\| \le \|u^n\| \tag{9.4}$$

under a suitable restriction on $\Delta t$:

$$\Delta t \le \Delta t_1, \tag{9.5}$$

then we look for higher order in time Runge-Kutta methods such that the same stability result (9.4) holds, under a perhaps different restriction on $\Delta t$:

$$\Delta t \le c\, \Delta t_1, \tag{9.6}$$

where $c$ is termed the CFL coefficient for the high order time discretization.

We remark that the stability condition (9.4) for the first order Euler forward in time (9.3) is easy to obtain in many cases, such as various TVD and TVB schemes in 1D (where the norm is the total variation norm) and in multi dimensions (where the norm is the $L^\infty$ norm), see, e.g. [66,100,85].

Originally  in  [89,86]  the  norm  in  (9.4)  was  chosen  to  be  the  total  variation 
norm,  hence  the  terminology  “TVD  time  discretization”. 

As it stands, the TVD high order time discretization defined above carries over to the high order scheme whatever stability the first order Euler forward time stepping has, in whatever norm, under the time step restriction (9.6). For example, if it is used for multi dimensional scalar conservation laws, for which TVD is not possible but maximum norm stability can be maintained for high order spatial discretizations plus forward Euler time stepping (e.g. [20]), then the same maximum norm stability can be maintained if TVD high order time discretization is used. As another example, if an entropy inequality can be proved for the Euler forward, then the same entropy inequality is valid under a high order TVD time discretization.

In [89], a general Runge-Kutta method for (9.1) is written in the form:

$$\begin{aligned}
u^{(i)} &= \sum_{k=0}^{i-1} \left( \alpha_{ik}\, u^{(k)} + \Delta t\, \beta_{ik}\, L(u^{(k)}) \right), \qquad i = 1, \ldots, m, \\
u^{(0)} &= u^n, \qquad u^{(m)} = u^{n+1}.
\end{aligned} \tag{9.7}$$

Clearly, if all the coefficients are nonnegative, $\alpha_{ik} \ge 0$, $\beta_{ik} \ge 0$, then (9.7) is just a convex combination of the Euler forward operators, with $\Delta t$ replaced by $\frac{\beta_{ik}}{\alpha_{ik}} \Delta t$, since by consistency $\sum_{k=0}^{i-1} \alpha_{ik} = 1$. We thus have

Lemma 9.1. [89] The Runge-Kutta method (9.7) is TVD under the CFL coefficient (9.6):

$$c = \min_{i,k} \frac{\alpha_{ik}}{\beta_{ik}}, \tag{9.8}$$

provided that $\alpha_{ik} \ge 0$, $\beta_{ik} \ge 0$. □


In  [89],  schemes  up  to  third  order  were  found  to  satisfy  the  conditions  in 
Lemma  9.1  with  CFL  coefficient  equal  to  1. 

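The optimal second order TVD Runge-Kutta method is given by [89,36]:

$$\begin{aligned}
u^{(1)} &= u^n + \Delta t\, L(u^n) \\
u^{n+1} &= \tfrac{1}{2} u^n + \tfrac{1}{2} u^{(1)} + \tfrac{1}{2} \Delta t\, L(u^{(1)}),
\end{aligned} \tag{9.9}$$

with a CFL coefficient $c = 1$ in (9.8).

The optimal third order TVD Runge-Kutta method is given by [89,36]:

$$\begin{aligned}
u^{(1)} &= u^n + \Delta t\, L(u^n) \\
u^{(2)} &= \tfrac{3}{4} u^n + \tfrac{1}{4} u^{(1)} + \tfrac{1}{4} \Delta t\, L(u^{(1)}) \\
u^{n+1} &= \tfrac{1}{3} u^n + \tfrac{2}{3} u^{(2)} + \tfrac{2}{3} \Delta t\, L(u^{(2)}),
\end{aligned} \tag{9.10}$$

with a CFL coefficient $c = 1$ in (9.8).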
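The two schemes (9.9) and (9.10) translate almost line by line into code. The following Python sketch is only an illustration: the state is a plain float and `L` is any callable, and the helper names `integrate` and `observed_order` are introduced here for a quick accuracy check on the test problem $u_t = -u$, which is an assumption of this example and not a problem used in the text.

```python
import math


def tvd_rk2_step(u, dt, L):
    """One step of the second order TVD Runge-Kutta method (9.9)."""
    u1 = u + dt * L(u)
    return 0.5 * u + 0.5 * u1 + 0.5 * dt * L(u1)


def tvd_rk3_step(u, dt, L):
    """One step of the third order TVD Runge-Kutta method (9.10)."""
    u1 = u + dt * L(u)
    u2 = 0.75 * u + 0.25 * u1 + 0.25 * dt * L(u1)
    return u / 3.0 + 2.0 / 3.0 * u2 + 2.0 / 3.0 * dt * L(u2)


def integrate(step, u0, t_end, n_steps, L):
    """Advance u0 to t_end with n_steps equal steps of the given scheme."""
    u, dt = u0, t_end / n_steps
    for _ in range(n_steps):
        u = step(u, dt, L)
    return u


def observed_order(step, L, u0, t_end, exact, n_steps=20):
    """Estimate the order of accuracy from the errors at dt and dt/2."""
    e1 = abs(integrate(step, u0, t_end, n_steps, L) - exact)
    e2 = abs(integrate(step, u0, t_end, 2 * n_steps, L) - exact)
    return math.log(e1 / e2, 2)
```

Calling `observed_order(tvd_rk3_step, lambda u: -u, 1.0, 1.0, math.exp(-1.0))` should return a value close to 3, and the same call with `tvd_rk2_step` a value close to 2. Each stage is visibly a convex combination of Euler forward steps, which is the structure Lemma 9.1 relies on.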

It  can  be  shown  that  for  any  order  of  accuracy,  c  =  1  is  the  best  one  can 
get  for  a  CFL  coefficient.  We  have  also  found,  for  a  linear  spatial  operator  L, 
optimal  TVD  Runge-Kutta  methods  for  arbitrary  order  of  accuracy  with  a 
CFL  coefficient  c  =  1.  These  results  will  appear  in  a  forthcoming  paper  [37]. 

Unfortunately, if $L$ is nonlinear, it is proven in [36] that no four stage, fourth order TVD Runge-Kutta method exists with nonnegative $\alpha_{ik}$ and $\beta_{ik}$. We thus have to consider the situation where $\alpha_{ik} \ge 0$ but $\beta_{ik}$ might be negative. In such situations we need to introduce an adjoint operator $\tilde L$. The requirement for $\tilde L$ is that it approximates the same spatial derivative(s) as $L$, but is TVD (or stable in another relevant norm) for first order Euler, backward in time:

$$u^{n+1} = u^n - \Delta t\, \tilde L(u^n). \tag{9.11}$$

This can be achieved, for hyperbolic conservation laws, by solving the backward in time version of (9.2):

$$u_t = f(u)_x. \tag{9.12}$$

Numerically, the only difference is the change of upwind direction. Clearly, $\tilde L$ can be computed with the same cost as that of computing $L$. We then have the following lemma:

Lemma 9.2. [89] The Runge-Kutta method (9.7) is TVD under the CFL coefficient (9.6):

$$c = \min_{i,k} \frac{\alpha_{ik}}{|\beta_{ik}|}, \tag{9.13}$$

provided that $\alpha_{ik} \ge 0$, and $L$ is replaced by $\tilde L$ for negative $\beta_{ik}$. □

Notice that, if for the same $k$, both $L(u^{(k)})$ and $\tilde L(u^{(k)})$ must be computed, the cost as well as the storage requirement for this $k$ is doubled. For this reason, we would like to avoid negative $\beta_{ik}$ as much as possible.

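An extensive search performed in [36] gives the following preferred four stage, fourth order TVD Runge-Kutta method:

$$\begin{aligned}
u^{(1)} &= u^n + \tfrac{1}{2} \Delta t\, L(u^n) \\
u^{(2)} &= \tfrac{649}{1600} u^{(0)} - \tfrac{10890423}{25193600} \Delta t\, \tilde L(u^n) + \tfrac{951}{1600} u^{(1)} + \tfrac{5000}{7873} \Delta t\, L(u^{(1)}) \\
u^{(3)} &= \tfrac{53989}{2500000} u^n - \tfrac{102261}{5000000} \Delta t\, \tilde L(u^n) + \tfrac{4806213}{20000000} u^{(1)} - \tfrac{5121}{20000} \Delta t\, \tilde L(u^{(1)}) + \tfrac{23619}{32000} u^{(2)} + \tfrac{7873}{10000} \Delta t\, L(u^{(2)}) \\
u^{n+1} &= \tfrac{1}{5} u^n + \tfrac{1}{10} \Delta t\, L(u^n) + \tfrac{6127}{30000} u^{(1)} + \tfrac{1}{6} \Delta t\, L(u^{(1)}) + \tfrac{7873}{30000} u^{(2)} + \tfrac{1}{3} u^{(3)} + \tfrac{1}{6} \Delta t\, L(u^{(3)})
\end{aligned} \tag{9.14}$$

with a CFL coefficient $c = 0.936$ in (9.13). Notice that two $\tilde L$'s must be computed. The effective CFL coefficient, compared with an ideal case without $\tilde L$'s, is $0.936 \times \frac{2}{3} = 0.624$. Since it is difficult to solve the global optimization problem, we do not claim that (9.14) is the optimal 4 stage, 4th order TVD Runge-Kutta method.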
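The CFL coefficients of Lemmas 9.1 and 9.2 are simple arithmetic on the coefficient tables, which the following Python sketch makes explicit with exact fractions. The function name `cfl_coefficient` and the row-by-row layout of the tables are choices made for this illustration. For the fourth order method only the coefficients of the stages $u^{(2)}$ and $u^{(3)}$ are entered; the sketch therefore assumes that the remaining stages are not more restrictive.

```python
from fractions import Fraction as F


def cfl_coefficient(alpha, beta):
    """min over (i, k) of alpha_ik / |beta_ik|, as in (9.8) and (9.13).

    Entries with beta_ik == 0 impose no restriction and are skipped.
    """
    ratios = []
    for a_row, b_row in zip(alpha, beta):
        for a, b in zip(a_row, b_row):
            if a < 0:
                raise ValueError("alpha_ik must be nonnegative")
            if b != 0:
                ratios.append(a / abs(b))
    return min(ratios)


# Third order method (9.10): one row per stage, one column per k.
rk3_alpha = [[F(1)], [F(3, 4), F(1, 4)], [F(1, 3), F(0), F(2, 3)]]
rk3_beta = [[F(1)], [F(0), F(1, 4)], [F(0), F(0), F(2, 3)]]

# Stages u^(2) and u^(3) of the fourth order method; negative beta
# entries are the ones that multiply the adjoint operator.
rk4_alpha = [[F(649, 1600), F(951, 1600)],
             [F(53989, 2500000), F(4806213, 20000000), F(23619, 32000)]]
rk4_beta = [[F(-10890423, 25193600), F(5000, 7873)],
            [F(-102261, 5000000), F(-5121, 20000), F(7873, 10000)]]

c3 = cfl_coefficient(rk3_alpha, rk3_beta)
c4 = cfl_coefficient(rk4_alpha, rk4_beta)
# Four evaluations of L plus two of the adjoint operator: 4/6 of the ideal.
effective_c4 = c4 * F(4, 6)
```

With these tables `c3` is exactly 1, `float(c4)` rounds to 0.936 and `float(effective_c4)` rounds to 0.624, in agreement with the values quoted above.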

A  fifth  order  TVD  Runge-Kutta  method  is  also  given  in  [89]. 

For  large  scale  scientific  computing  in  three  space  dimensions,  storage  is 
usually  a  paramount  consideration.  There  are  therefore  discussions  about  low 
storage  Runge-Kutta  methods  [103],  [13],  which  only  require  2  storage  units 
per  ODE  equation.  In  [36],  we  considered  the  TVD  properties  among  such 
low  storage  Runge-Kutta  methods  and  found  third  order  low  storage  TVD 
Runge-Kutta  methods. 

The general low-storage Runge-Kutta schemes can be written in the form [103], [13]:

$$\begin{aligned}
du^{(i)} &= A_i\, du^{(i-1)} + \Delta t\, L(u^{(i-1)}) \\
u^{(i)} &= u^{(i-1)} + B_i\, du^{(i)}, \qquad i = 1, \ldots, m \\
u^{(0)} &= u^n, \qquad u^{(m)} = u^{n+1}, \qquad A_0 = 0.
\end{aligned} \tag{9.15}$$

Only $u$ and $du$ must be stored, resulting in two storage units for each variable.
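The recurrence (9.15) can be sketched in a few lines of Python. Note the assumptions: the coefficients below are the classical third order low-storage values usually attributed to Williamson [103], used here only to exercise the recurrence; they are not the TVD-optimized coefficients discussed in this section. The first entry of `A` multiplies a zero `du`, so its value plays no role.

```python
import math


def low_storage_rk_step(u, dt, L, A, B):
    """One step of the two-register scheme (9.15).

    Only u and du survive from one stage to the next.
    """
    du = 0.0
    for a, b in zip(A, B):
        du = a * du + dt * L(u)
        u = u + b * du
    return u


# Classical third order low-storage coefficients (an assumption of this
# example, not the TVD-optimized family of the text).
A = (0.0, -5.0 / 9.0, -153.0 / 128.0)
B = (1.0 / 3.0, 15.0 / 16.0, 8.0 / 15.0)

# One step for u' = -u with u(0) = 1 and dt = 0.1.
u_new = low_storage_rk_step(1.0, 0.1, lambda u: -u, A, B)
error = abs(u_new - math.exp(-0.1))
```

For this linear test problem the step reproduces the Taylor polynomial $1 + z + z^2/2 + z^3/6$ with $z = -\Delta t$, so the one-step error is of size $\Delta t^4/24$.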

Carpenter and Kennedy [13] have classified all the three stage, third order (m=3) low storage Runge-Kutta methods, obtaining the following one parameter family:

$$\begin{aligned}
z_1 &= \sqrt{36c_2^4 + 36c_2^3 - 135c_2^2 + 84c_2 - 12} \\
z_2 &= 2c_2^2 + c_2 - 2 \\
z_3 &= 12c_2^4 - 18c_2^3 + 18c_2^2 - 11c_2 + 2 \\
z_4 &= 36c_2^4 - 36c_2^3 + 13c_2^2 - 8c_2 + 4 \\
z_5 &= 69c_2^3 - 62c_2^2 + 28c_2 - 8 \\
z_6 &= 34c_2^4 - 46c_2^3 + 34c_2^2 - 13c_2 + 2 \\
B_1 &= c_2 \\
B_2 &= \frac{12c_2(c_2-1)(3z_2 - z_1) - (3z_2 - z_1)^2}{144c_2(3c_2-2)(c_2-1)^2} \\
B_3 &= \frac{-24(3c_2-2)(c_2-1)^2}{(3z_2 - z_1)^2 - 12c_2(c_2-1)(3z_2 - z_1)} \\
A_2 &= \frac{-z_1(6c_2^2 - 4c_2 + 1) + 3z_3}{(2c_2+1)z_1 - 3(c_2+2)(2c_2-1)^2} \\
A_3 &= \frac{-z_4 z_1 + 108(2c_2-1)c_2^5 - 3(2c_2-1)z_5}{24z_1c_2(c_2-1)^4 + 72c_2z_6 + 72c_2^6(2c_2-13)}
\end{aligned} \tag{9.16}$$

In [36] we converted this form into the form (9.7), by introducing three new parameters. Then we searched, by a computer program, for values of these parameters that would maximize the CFL coefficient. The result seems to indicate that

$$c_2 = 0.924574 \tag{9.17}$$

gives an almost best choice, with CFL coefficient $c = 0.32$ in (9.8). This is of course less favorable than (9.10) in terms of CFL coefficients; however, the low storage form is useful for large scale calculations.

We end this subsection by quoting the following numerical example [36], which shows that, even with a very nice second order TVD spatial discretization, if the time discretization is by a non-TVD but linearly stable Runge-Kutta method, the result may be oscillatory. Thus it would always be safer to use TVD Runge-Kutta methods for hyperbolic problems.

The numerical example uses the standard minmod based MUSCL second order spatial discretization [101]. We will compare the results of a TVD versus a non-TVD second order Runge-Kutta time discretization. The PDE is the simple Burgers equation

$$u_t + \left(\tfrac{1}{2} u^2\right)_x = 0 \tag{9.18}$$

with Riemann initial data:

$$u(x,0) = \begin{cases} 1, & \text{if } x \le 0, \\ -0.5, & \text{if } x > 0. \end{cases} \tag{9.19}$$

The nonlinear flux term $(\frac{1}{2}u^2)_x$ in (9.18) is approximated by the conservative difference

$$\frac{1}{\Delta x}\left(\hat f_{j+1/2} - \hat f_{j-1/2}\right),$$

where the numerical flux $\hat f_{j+1/2}$ is defined by

$$\hat f_{j+1/2} = h\left(u^-_{j+1/2}, u^+_{j+1/2}\right)$$

with

$$\begin{aligned}
u^-_{j+1/2} &= u_j + \tfrac{1}{2}\,\mathrm{minmod}(u_{j+1} - u_j,\; u_j - u_{j-1}), \\
u^+_{j+1/2} &= u_{j+1} - \tfrac{1}{2}\,\mathrm{minmod}(u_{j+2} - u_{j+1},\; u_{j+1} - u_j).
\end{aligned}$$

The monotone flux $h$ is the Godunov flux defined by (4.6), and the minmod function is given by

$$\mathrm{minmod}(a,b) = \frac{\mathrm{sign}(a) + \mathrm{sign}(b)}{2}\, \min(|a|, |b|).$$

It is easy to prove, by using Harten's Lemma [42], that the Euler forward time discretization with this second order MUSCL spatial operator is TVD under the CFL condition (9.5):

$$\Delta t \le \frac{\Delta x}{2 \max_j |u_j^n|}. \tag{9.20}$$

Thus $\Delta t = \frac{\Delta x}{2 \max_j |u_j^n|}$ is used in all our calculations. Actually, apart from a slight difference (the minmod function is replaced by a minimum-in-absolute-value function), this MUSCL scheme is the same as the second order ENO scheme discussed in Sect. 4.1.

The TVD second order Runge-Kutta method we consider is the optimal one (9.9). The non-TVD method we use is:

$$\begin{aligned}
u^{(1)} &= u^n - 20 \Delta t\, L(u^n) \\
u^{n+1} &= u^n + \tfrac{41}{40} \Delta t\, L(u^n) - \tfrac{1}{40} \Delta t\, L(u^{(1)}).
\end{aligned} \tag{9.21}$$

It is easy to verify that both methods are second order accurate in time. The second one (9.21) is however clearly non-TVD, since it has negative $\beta$'s in both stages (i.e. it partially simulates backward in time with wrong upwinding).

If the operator $L$ is linear (for example the first order upwind scheme applied to a linear PDE), then both Runge-Kutta methods (actually all the two stage, second order Runge-Kutta methods) yield identical results (the two stage, second order Runge-Kutta method for a linear ODE is unique). However, since our $L$ is nonlinear, we may and do observe different results when the two Runge-Kutta methods are used.

In  Fig.  9.1  we  show  the  result  of  the  TVD  Runge-Kutta  method  (9.9)  and 
the  non-TVD  method  (9.21),  after  the  shock  moves  about  50  grids  (400  time 
steps  for  the  TVD  method,  528  time  steps  for  the  non-TVD  method).  We 
can  clearly  see  that  the  non-TVD  result  is  oscillatory  (there  is  an  overshoot). 


Fig.  9.1.  Second  order  TVD  MUSCL  spatial  discretization.  Solution  after  the  shock 
moves  50  grids.  Left:  with  TVD  time  discretization  (9.9);  Right:  with  non-TVD  time 
discretization  (9.21). 
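The experiment of Fig. 9.1 can be set up along the following lines. This is a sketch under stated assumptions, not the code used for the figure: the grid (200 cells on $[-1,1]$), the constant extension at both ends, the pure Python list arithmetic and all function names are choices made here. The spatial operator is the minmod MUSCL scheme with the Godunov flux for $f(u) = u^2/2$, and the time step is recomputed every step as $\Delta x / (2\max_j|u_j^n|)$.

```python
def minmod(a, b):
    """minmod(a, b) = (sign(a) + sign(b)) / 2 * min(|a|, |b|)."""
    sign = lambda s: (s > 0) - (s < 0)
    return 0.5 * (sign(a) + sign(b)) * min(abs(a), abs(b))


def godunov_flux(a, b):
    """Godunov flux for f(u) = u^2 / 2, whose minimum is at u = 0."""
    fa, fb = 0.5 * a * a, 0.5 * b * b
    if a <= b:
        return 0.0 if a <= 0.0 <= b else min(fa, fb)
    return max(fa, fb)


def L(u, dx):
    """MUSCL approximation of -(u^2/2)_x with two constant ghost cells."""
    n = len(u)
    ext = [u[0]] * 2 + list(u) + [u[-1]] * 2
    # slope[m] is the limited difference belonging to ext[m + 1]
    slope = [minmod(ext[k + 1] - ext[k], ext[k] - ext[k - 1])
             for k in range(1, n + 3)]
    flux = []
    for k in range(1, n + 2):            # interface between ext[k], ext[k+1]
        left = ext[k] + 0.5 * slope[k - 1]
        right = ext[k + 1] - 0.5 * slope[k]
        flux.append(godunov_flux(left, right))
    return [-(flux[j + 1] - flux[j]) / dx for j in range(n)]


def combine(*terms):
    """Linear combination sum(c * v) of equally long lists."""
    return [sum(c * v[j] for c, v in terms) for j in range(len(terms[0][1]))]


def step_tvd(u, dt, dx):
    """The TVD scheme (9.9)."""
    u1 = combine((1.0, u), (dt, L(u, dx)))
    return combine((0.5, u), (0.5, u1), (0.5 * dt, L(u1, dx)))


def step_non_tvd(u, dt, dx):
    """The non-TVD scheme (9.21)."""
    Lu = L(u, dx)
    u1 = combine((1.0, u), (-20.0 * dt, Lu))
    return combine((1.0, u), (41.0 / 40.0 * dt, Lu), (-dt / 40.0, L(u1, dx)))


def run(step, n=200, cells_to_move=50, max_steps=5000):
    """March the Riemann data (9.19) until the shock has moved the given cells."""
    dx = 2.0 / n
    u = [1.0 if (j + 0.5) * dx - 1.0 <= 0.0 else -0.5 for j in range(n)]
    t, steps = 0.0, 0
    t_end = cells_to_move * dx / 0.25    # shock speed is (1 - 0.5) / 2
    while t < t_end - 1e-12 and steps < max_steps:
        dt = min(dx / (2.0 * max(abs(v) for v in u)), t_end - t)
        u = step(u, dt, dx)
        t += dt
        steps += 1
    return u, steps


def total_variation(u):
    return sum(abs(u[j + 1] - u[j]) for j in range(len(u) - 1))
```

Comparing `max(run(step_tvd)[0])` with `max(run(step_non_tvd)[0])`, or the two total variations, is one way to look for the overshoot described above. For the TVD run the total variation should stay at its initial value 1.5 up to rounding.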


Such oscillations are also observed when the non-TVD Runge-Kutta method coupled with a second order TVD MUSCL spatial discretization is applied to a linear PDE ($u_t + u_x = 0$) (the scheme is still nonlinear due to the minmod functions). Moreover, for some Runge-Kutta methods, if one looks at the intermediate stages, i.e. $u^{(i)}$ for $1 \le i < m$ in (9.7), one observes even bigger oscillations. Such oscillations may cause difficulties when physical problems are solved, such as the appearance of negative density and pressure for the Euler equations of gas dynamics. On the other hand, a TVD Runge-Kutta method guarantees that each intermediate stage solution is also TVD.

This  simple  numerical  test  convinces  us  that  it  is  much  safer  to  use  a 
TVD  Runge-Kutta  method  for  solving  hyperbolic  problems. 

9.2  TVD  Multi-Step  Methods 

If one prefers multi-step methods rather than Runge-Kutta methods, one can use the TVD high order multi-step methods developed in [86]. The philosophy is very similar to the TVD Runge-Kutta methods discussed in the previous subsection. One starts with a method of lines approximation (9.1) to the PDE (9.2), and an assumption that the first order Euler forward in time discretization (9.3) is stable under a certain norm (9.4), with the time step restriction (9.5). One then looks for higher order in time multi-step methods such that the same stability result (9.4) holds, under a perhaps different restriction on $\Delta t$ in (9.6), where $c$ is again termed the CFL coefficient for the high order time discretization.

The general form of the multi-step methods studied in [86] is:

$$u^{n+1} = \sum_{k=0}^{m} \left( \alpha_k\, u^{n-k} + \Delta t\, \beta_k\, L(u^{n-k}) \right). \tag{9.22}$$

Similar to the Runge-Kutta methods in the previous subsection, if all the coefficients are nonnegative, $\alpha_k \ge 0$, $\beta_k \ge 0$, then (9.22) is just a convex combination of the Euler forward operators, with $\Delta t$ replaced by $\frac{\beta_k}{\alpha_k} \Delta t$, since by consistency $\sum_{k=0}^{m} \alpha_k = 1$. We thus have

Lemma 9.3. [86] The multi-step method (9.22) is TVD under the CFL coefficient (9.6):

$$c = \min_k \frac{\alpha_k}{\beta_k}, \tag{9.23}$$

provided that $\alpha_k \ge 0$, $\beta_k \ge 0$. □


In  [86],  schemes  up  to  third  order  were  found  to  satisfy  the  conditions  in 
Lemma  9.3.  Here  we  list  a  few  examples. 

The following three step (m = 2) scheme is second order and TVD

$$u^{n+1} = \tfrac{3}{4} u^n + \tfrac{3}{2} \Delta t\, L(u^n) + \tfrac{1}{4} u^{n-2} \tag{9.24}$$

with a CFL coefficient $c = 0.5$ in (9.23). This translates to the same efficiency as the optimal second order TVD Runge-Kutta scheme (9.9), as here only one residue evaluation is needed per time step. Of course, the storage requirement is bigger here. There is also the problem of the starting values $u^1$ and $u^2$.

The following five step (m = 4) scheme is third order and TVD

$$u^{n+1} = \tfrac{25}{32} u^n + \tfrac{25}{16} \Delta t\, L(u^n) + \tfrac{7}{32} u^{n-4} + \tfrac{5}{16} \Delta t\, L(u^{n-4}) \tag{9.25}$$

with a CFL coefficient $c = 0.5$ in (9.23). This translates to a better efficiency than the optimal third order TVD Runge-Kutta scheme (9.10), as here only one residue evaluation is needed per time step. Of course, the storage requirement is much bigger here. There is also the problem of the starting values $u^1$, $u^2$, $u^3$ and $u^4$.

There  are  many  other  TVD  multi-step  methods  satisfying  the  conditions 
in  Lemma  9.3  listed  in  [86].  It  seems  that  if  one  uses  more  storage  (larger  m) 
one  could  get  better  CFL  coefficients. 
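As a small illustration of how such a multi-step scheme is driven, the following Python sketch applies the three step, second order method, $u^{n+1} = \frac{3}{4}u^n + \frac{3}{2}\Delta t L(u^n) + \frac{1}{4}u^{n-2}$, to the test problem $u_t = -u$. The test problem, the use of exact values for the starting levels and the function names are assumptions of this example; in a PDE computation the starting values would have to come from another scheme, for instance a TVD Runge-Kutta method.

```python
import math


def tvd_multistep2(history, dt, L):
    """Second order TVD three step scheme.

    history holds [u^{n-2}, u^{n-1}, u^n]; the middle level is carried
    along but has zero coefficients in this particular scheme.
    """
    u_nm2, _, u_n = history
    return 0.75 * u_n + 1.5 * dt * L(u_n) + 0.25 * u_nm2


def solve_decay(n_steps, t_end=1.0):
    """Integrate u' = -u, u(0) = 1, seeding the first three levels exactly."""
    dt = t_end / n_steps
    L = lambda u: -u
    history = [math.exp(-k * dt) for k in range(3)]
    for _ in range(n_steps - 2):
        history = [history[1], history[2], tvd_multistep2(history, dt, L)]
    return history[-1]


error_coarse = abs(solve_decay(100) - math.exp(-1.0))
error_fine = abs(solve_decay(200) - math.exp(-1.0))
```

Halving the time step should reduce the error by a factor close to 4, consistent with second order accuracy. The only nonzero ratio $\alpha_k/\beta_k$ is $(3/4)/(3/2) = 0.5$, which is the CFL coefficient quoted for this scheme.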

In [86] we have been unable to find multi-step schemes of order four or higher satisfying the condition of Lemma 9.3. As in the Runge-Kutta case, we can relax the condition $\beta_k \ge 0$ by introducing the adjoint operator $\tilde L$. We thus have

Lemma 9.4. [86] The multi-step method (9.22) is TVD under the CFL coefficient (9.6):

$$c = \min_k \frac{\alpha_k}{|\beta_k|}, \tag{9.26}$$

provided that $\alpha_k \ge 0$, and $L$ is replaced by $\tilde L$ for negative $\beta_k$. □

Again, notice that, if we have both positive and negative $\beta_k$'s, then both $L(u^n)$ and $\tilde L(u^n)$ must be computed, and the cost as well as the storage requirement will thus be doubled.

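We list here a six step (m = 5), fourth order multi-step method which is TVD with a CFL coefficient $c = 0.245$ in (9.26) [86]:

$$u^{n+1} = \tfrac{747}{1280} u^n + \tfrac{237}{128} \Delta t\, L(u^n) + \tfrac{81}{256} u^{n-4} + \tfrac{165}{128} \Delta t\, L(u^{n-4}) + \tfrac{1}{10} u^{n-5} - \tfrac{3}{8} \Delta t\, \tilde L(u^{n-5}). \tag{9.27}$$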


9.3 The Lax-Wendroff Procedure

Another way to discretize the time variable is by the Lax-Wendroff procedure [65]. This is also referred to as the Taylor series method for discretizing the ODE (9.1). We will again use the simple 1D scalar conservation law (9.2) as an example to illustrate the procedure; however, it applies to more general multidimensional systems.

Starting from a Taylor series expansion in time:

$$u(x, t + \Delta t) = u(x,t) + u_t(x,t)\, \Delta t + u_{tt}(x,t)\, \frac{\Delta t^2}{2} + \ldots \tag{9.28}$$

The expansion is carried out to the desired order of accuracy in time. For example, a second order in time scheme would need the three terms written out in (9.28). We then use the PDE (9.2) to replace the time derivatives by the spatial derivatives (all quantities are evaluated at $(x,t)$):

$$\begin{aligned}
u_t &= -f(u)_x = -f'(u)\, u_x; \\
u_{tt} &= -(f(u))_{tx} \\
&= -(f'(u)\, u_t)_x \\
&= \left( (f'(u))^2\, u_x \right)_x \\
&= 2 f'(u)\, f''(u)\, (u_x)^2 + (f'(u))^2\, u_{xx}.
\end{aligned} \tag{9.29}$$

This little exercise in (9.29) should convince us that it is always possible to write all the time derivatives as functions of $u(x,t)$ and its spatial derivatives. But the expression could be terribly complicated, especially for multidimensional systems.

Once this is done, we substitute (9.29) into (9.28), and then discretize the spatial derivatives of $u(x,t)$ by whatever methods we use. For example, in the cell averaged (finite volume) ENO schemes discussed in Sect. 4.1, we proceed as follows. We first integrate the PDE (4.1) in space-time over the region $[x_{i-1/2}, x_{i+1/2}] \times [t^n, t^{n+1}]$ to obtain

$$\bar u_i^{n+1} = \bar u_i^n - \frac{1}{\Delta x_i} \left( \int_{t^n}^{t^{n+1}} f(u(x_{i+1/2}, t))\, dt - \int_{t^n}^{t^{n+1}} f(u(x_{i-1/2}, t))\, dt \right). \tag{9.30}$$

Then, we use a suitable Gaussian quadrature to discretize the time integration for the flux in (9.30):

$$\frac{1}{\Delta t} \int_{t^n}^{t^{n+1}} f(u(x_{i+1/2}, t))\, dt \approx \sum_\alpha \omega_\alpha\, f(u(x_{i+1/2}, t^n + \beta_\alpha \Delta t)), \tag{9.31}$$

where $\beta_\alpha$ and $\omega_\alpha$ are Gaussian quadrature nodes and weights. Next we replace each

$$f(u(x_{i+1/2}, t^n + \beta_\alpha \Delta t))$$

by a monotone flux:

$$f(u(x_{i+1/2}, t^n + \beta_\alpha \Delta t)) \approx h\left( u(x^-_{i+1/2}, t^n + \beta_\alpha \Delta t),\; u(x^+_{i+1/2}, t^n + \beta_\alpha \Delta t) \right), \tag{9.32}$$

and use the Lax-Wendroff procedure (9.28)-(9.29) to convert

$$u(x^\pm_{i+1/2}, t^n + \beta_\alpha \Delta t)$$

to $u(x^\pm_{i+1/2}, t^n)$ and its spatial derivatives also at $t^n$, which can then be obtained by the reconstructions $p(x)$ inside $I_i$ and $I_{i+1}$. Notice that the accuracy is just enough in this procedure, as each derivative of the reconstruction $p(x)$ will be one order lower in accuracy, but this is compensated by the $\Delta t$ in front of it in (9.28).
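To see the substitution of (9.29) into (9.28) at work in the simplest setting, consider the linear flux $f(u) = au$, for which $u_t = -a u_x$ and $u_{tt} = a^2 u_{xx}$. The Python sketch below discretizes both spatial derivatives by central differences on a periodic grid, which gives the classical second order Lax-Wendroff scheme. The central differences, the periodic boundary and the function name are assumptions of this illustration; the ENO-based procedure described above uses reconstructions and a monotone flux instead.

```python
def lax_wendroff_step(u, a, dt, dx):
    """u^{n+1} = u - a dt u_x + (a dt)^2 / 2 u_xx on a periodic grid."""
    n = len(u)
    nu = a * dt / dx                      # Courant number
    new = []
    for j in range(n):
        um, up = u[j - 1], u[(j + 1) % n]  # periodic neighbors
        new.append(u[j]
                   - 0.5 * nu * (up - um)
                   + 0.5 * nu * nu * (up - 2.0 * u[j] + um))
    return new


u0 = [0.0, 1.0, 4.0, 9.0, 2.0]
u1 = lax_wendroff_step(u0, a=1.0, dt=0.1, dx=0.1)
```

With Courant number exactly 1 the update reduces to $u_j^{n+1} = u_{j-1}^n$, so `u1` is the data shifted one cell to the right, which is the exact solution. The scheme uses only three cells at level $n$ to update one cell, in line with the compactness noted among the advantages of the procedure.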

This Lax-Wendroff procedure, compared with the method of lines approach coupled with TVD Runge-Kutta or multi-step time discretizations, has the following advantages and disadvantages.


Advantages: 

1.  This  is  a  truly  one  step  method,  hence  it  is  quite  compact  (a  second  order 
method  in  space  and  time  uses  only  three  cells  on  time  level  n  to  advance 
to  time  level  n  +  1  for  one  cell),  and  there  are  no  complications  such  as 
boundary  conditions  needed  in  middle  stages; 
2. It utilizes the PDE more extensively than the method of lines approach.
This  is  also  one  reason  that  it  can  be  so  compact. 

Disadvantages: 

1.  The  algebra  is  very,  very  complicated  for  multi  dimensional  systems.  This 
also  increases  operation  counts  for  complicated  nonlinear  systems; 

2.  It  is  more  difficult  to  prove  stability  properties  (e.g.  TVD)  for  higher 
order  methods  in  this  framework; 

3.  It  is  difficult  and  costly  to  apply  this  procedure  to  the  conservative  finite 
difference  framework  established  in  Sections  4.2  and  7.2. 

10 Formulation of the ENO and WENO Schemes for the Hamilton-Jacobi Equations

In this section we describe high order ENO and WENO approximations to the Hamilton-Jacobi equation:

$$\begin{cases} \phi_t + H(\phi_x, \phi_y) = 0, \\ \phi(x,y,0) = \phi^0(x,y), \end{cases} \tag{10.1}$$

where $H$ is a locally Lipschitz continuous Hamiltonian and the initial condition $\phi^0(x,y)$ is locally Lipschitz continuous. We have written the equation (10.1) in two space dimensions, but the discussion is valid for other space dimensions as well.

As is well known, solutions to (10.1) are Lipschitz continuous but may have discontinuous derivatives, regardless of the smoothness of $\phi^0(x,y)$. The non-uniqueness of such generalized solutions also necessitates the definition of viscosity solutions, to single out a unique, practically relevant solution. The viscosity solution to (10.1) is a locally Lipschitz continuous function $\phi(x,y,t)$, which satisfies the initial condition and the following property: for any smooth function $\psi(x,y,t)$, if $(x_0,y_0,t_0)$ is a local maximum point of $\phi - \psi$, then

$$\psi_t(x_0,y_0,t_0) + H(\psi_x(x_0,y_0,t_0), \psi_y(x_0,y_0,t_0)) \le 0,$$

and, if $(x_0,y_0,t_0)$ is a local minimum point of $\phi - \psi$, then

$$\psi_t(x_0,y_0,t_0) + H(\psi_x(x_0,y_0,t_0), \psi_y(x_0,y_0,t_0)) \ge 0.$$

Of course, the above definition means that whenever $\phi(x,y,t)$ is differentiable, (10.1) is satisfied in the classical sense. The viscosity solution defined this way exists and is unique. For details and equivalent definitions of viscosity solutions, see Crandall and Lions [24].

Hamilton-Jacobi equations are actually easier to solve than conservation laws, because the solutions are typically continuous (only the derivatives are discontinuous).

As before, given mesh sizes $\Delta x$, $\Delta y$ and $\Delta t$, we denote the mesh points as $(x_i, y_j, t^n) = (i\Delta x, j\Delta y, n\Delta t)$. The numerical approximation to the viscosity solution $\phi(x_i, y_j, t^n)$ of (10.1) at the mesh point $(x_i, y_j, t^n)$ is denoted by $\phi^n_{ij}$. We again use a semi-discrete (discrete in the spatial variables only) formulation as a middle step in designing algorithms. In such cases, the numerical approximation to the viscosity solution $\phi(x_i, y_j, t)$ of (10.1) at the mesh point $(x_i, y_j, t)$ is denoted by $\phi_{ij}(t)$; the temporal variable $t$ is not discretized. We will also use the notations $D^x_\pm \phi_{ij} = \pm\frac{\phi_{i\pm1,j} - \phi_{ij}}{\Delta x}$ and $D^y_\pm \phi_{ij} = \pm\frac{\phi_{i,j\pm1} - \phi_{ij}}{\Delta y}$ to denote the first order forward/backward difference approximations to the right/left derivatives of $\phi(x,y)$ at the location $(x_i, y_j)$.

Since the viscosity solution to (10.1) is usually only Lipschitz continuous but not everywhere differentiable, the formal order of accuracy of a numerical scheme is again defined as that determined by the local truncation error in the smooth regions of the solution. Thus, a monotone scheme of the form

$$\phi^{n+1}_{ij} = G(\phi^n_{i-p,j-r}, \ldots, \phi^n_{i+q,j+s}) \tag{10.2}$$

where $G$ is a non-decreasing function of each argument, is called a first order scheme, although the provable order of accuracy in the $L^\infty$ norm is just $\frac{1}{2}$ [25]. In the semi-discrete formulation, a five point monotone scheme (it does not pay to use more points for a monotone scheme because the order of accuracy of a monotone scheme is at most one [45]) is of the form

$$\frac{d}{dt} \phi_{ij} = -\hat H(D^x_+ \phi_{ij}, D^x_- \phi_{ij}, D^y_+ \phi_{ij}, D^y_- \phi_{ij}). \tag{10.3}$$

The numerical Hamiltonian $\hat H$ is assumed to be locally Lipschitz continuous, consistent with $H$: $\hat H(u,u,v,v) = H(u,v)$, and is non-increasing in its first and third arguments and non-decreasing in the other two. Symbolically $\hat H(\downarrow, \uparrow, \downarrow, \uparrow)$. It is easy to see that, if the time derivative in (10.3) is discretized by Euler forward differencing, the resulting fully discrete scheme, in the form of (10.2), will be monotone when $\Delta t$ is suitably small. We have chosen the semi-discrete formulation (10.3) in order to apply suitable nonlinearly stable high order Runge-Kutta type time discretization, see Sect. 9.

Semi-discrete or fully discrete monotone schemes (10.3) and (10.2) are both convergent towards the viscosity solution of (10.1) [25]. However, monotone schemes are at most first order accurate. As before, we will use the monotone schemes as building blocks for higher order ENO and WENO schemes.

ENO schemes were adapted to the Hamilton-Jacobi equations (10.1) by Osher and Sethian [78] and Osher and Shu [79]. As we know now, the key feature of the ENO algorithm is an adaptive stencil high order interpolation which tries to avoid shocks or high gradient regions whenever possible. Since the Hamilton-Jacobi equation (10.1) is closely related to the conservation law (7.1), in fact in one space dimension they are exactly the same if one takes $u = \phi_x$, it is not surprising that successful numerical schemes for the conservation laws (7.1), such as ENO and WENO, can be applied to the Hamilton-Jacobi equation (10.1). ENO and WENO schemes, when applied to Hamilton-Jacobi equations (10.1), can produce high order accuracy in the smooth regions of the solution, and sharp, non-oscillatory corners (discontinuities in derivatives).

There are many monotone Hamiltonians [25], [78], [79]. In this section we mainly discuss the following two:

1. For the special case $H(u,v) = f(u^2, v^2)$ where $f$ is a monotone function of both arguments, such as the example $H(u,v) = \sqrt{u^2 + v^2}$, we can use the Osher-Sethian monotone Hamiltonian [78]:

$$\hat H^{OS}(u^+, u^-, v^+, v^-) = f(\bar u^2, \bar v^2) \tag{10.4}$$

where, if $f$ is a non-increasing function of $u^2$, $\bar u^2$ is implemented by

$$\bar u^2 = (\min(u^-, 0))^2 + (\max(u^+, 0))^2 \tag{10.5}$$

and, if $f$ is a non-decreasing function of $u^2$, $\bar u^2$ is implemented by

$$\bar u^2 = (\min(u^+, 0))^2 + (\max(u^-, 0))^2 \tag{10.6}$$

Similarly for $\bar v^2$. This Hamiltonian is purely upwind (i.e. when $H(u,v)$ is monotone in $u$ in the relevant domain $[u^-, u^+] \times [v^-, v^+]$, only $u^-$ or $u^+$ is used in the numerical Hamiltonian according to the wind direction), and simple to program. Whenever applicable it should be used. This flux is similar to the Engquist-Osher monotone flux (4.7) for the conservation laws.

2. For the general $H$ we can always use the Godunov type Hamiltonian [6], [79]:

$$\hat H^G(u^+, u^-, v^+, v^-) = \operatorname{ext}_{u \in I(u^-, u^+)}\, \operatorname{ext}_{v \in I(v^-, v^+)}\, H(u,v) \tag{10.7}$$

where the extrema are defined by

$$I(a,b) = [\min(a,b), \max(a,b)], \qquad \operatorname{ext}_{u \in I(a,b)} = \begin{cases} \min_{a \le u \le b} & \text{if } a \le b, \\ \max_{b \le u \le a} & \text{if } a > b. \end{cases} \tag{10.8}$$

The Godunov Hamiltonian is obtained by attempting to solve the Riemann problem of the equation (10.1) exactly with piecewise linear initial condition determined by $u^\pm$ and $v^\pm$. It is in general not unique, because in general $\min_u \max_v H(u,v) \ne \max_v \min_u H(u,v)$ and interchanging the order of the two ext's in (10.7) can produce a different monotone Hamiltonian.

The Godunov Hamiltonian is purely upwind and is the least dissipative among all monotone Hamiltonians [76]. However, it might be extremely difficult to program, since in general analytical expressions for things like $\min_u \max_v H(u,v)$ can be quite complicated. The readers will be convinced by doing the exercise of obtaining the analytical expression and programming $\hat H^G$ for the ellipse in ellipse case in image processing where $H(u,v) = \sqrt{au^2 + 2buv + cv^2}$. For this case the Osher-Sethian Hamiltonian $\hat H^{OS}$ does not apply.
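The Osher-Sethian Hamiltonian of item 1 is indeed short to program. The following Python sketch writes it out for the example $H(u,v) = \sqrt{u^2 + v^2}$, for which $f$ is non-decreasing in both arguments so that (10.6) is the relevant branch. The function names and the boolean flag selecting between (10.5) and (10.6) are conventions of this illustration.

```python
import math


def os_square(d_plus, d_minus, f_nondecreasing=True):
    """Upwinded square from (10.6) if f is non-decreasing, else (10.5)."""
    if f_nondecreasing:
        return min(d_plus, 0.0) ** 2 + max(d_minus, 0.0) ** 2
    return min(d_minus, 0.0) ** 2 + max(d_plus, 0.0) ** 2


def hamiltonian_os(u_plus, u_minus, v_plus, v_minus):
    """Osher-Sethian Hamiltonian (10.4) for H(u, v) = sqrt(u^2 + v^2)."""
    return math.sqrt(os_square(u_plus, u_minus)
                     + os_square(v_plus, v_minus))


# Consistency: equal one-sided derivatives reproduce H(3, -4) = 5.
h_smooth = hamiltonian_os(3.0, 3.0, -4.0, -4.0)
# At a kink that is a local minimum in x (u^- < 0 < u^+) nothing is upwinded.
h_valley = hamiltonian_os(1.0, -1.0, 0.0, 0.0)
# At a kink that is a local maximum in x (u^+ < 0 < u^-) both sides contribute.
h_peak = hamiltonian_os(-1.0, 1.0, 0.0, 0.0)
```

The three sample values illustrate the consistency requirement $\hat H(u,u,v,v) = H(u,v)$ and the monotonicity pattern: the result does not increase when $u^+$ increases and does not decrease when $u^-$ increases.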

We are now ready to discuss higher order ENO or WENO schemes for (10.1). The framework is quite simple: we simply replace the first order scheme (10.3) by:

$$\frac{d}{dt} \phi_{ij}(t) = -\hat H\left( u^+_{ij}(t), u^-_{ij}(t), v^+_{ij}(t), v^-_{ij}(t) \right) \tag{10.9}$$

where $u^\pm_{ij}(t)$ are high order approximations to the left and right $x$-derivatives of $\phi(x,y,t)$ at $(x_i, y_j, t)$:

$$u^\pm_{ij}(t) = \frac{\partial \phi}{\partial x}(x_i^\pm, y_j, t) + O(\Delta x^r). \tag{10.10}$$

Similarly for $v^\pm_{ij}(t)$. Notice that there is no cell-averaged version now.

The  key  feature  of  ENO  to  avoid  numerical  oscillations  is  through  the 
following  interpolation  procedure  to  obtain  ufj(t)  and  vfj(t).  These  are  just 
the  same  ENO  procedure  we  discussed  before  in  Sect.  3.  We  repeat  it  here 
with  its  own  notations: 


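ENO Interpolation Algorithm: Given point values $f(x_j)$, $j = 0, \pm 1, \pm 2, \ldots$
of a (usually piecewise smooth) function $f(x)$ at discrete nodes $x_j$, we associate
an $r$-th degree polynomial $P^{f,r}_{j+1/2}(x)$ with each interval $[x_j, x_{j+1}]$, with
the left-most point in the stencil as $x_{k^{(r)}_{\min}}$, constructed inductively as follows:

(1) $P^{f,1}_{j+1/2}(x) = f[x_j] + f[x_j, x_{j+1}](x - x_j)$, $k^{(1)}_{\min} = j$;

(2) If $k^{(l-1)}_{\min}$ and $P^{f,l-1}_{j+1/2}(x)$ are both defined, then let

$$a^{(l)} = f[x_{k^{(l-1)}_{\min}}, \ldots, x_{k^{(l-1)}_{\min}+l}], \qquad b^{(l)} = f[x_{k^{(l-1)}_{\min}-1}, \ldots, x_{k^{(l-1)}_{\min}+l-1}],$$

and

(i) If $|a^{(l)}| \ge |b^{(l)}|$, then $c^{(l)} = b^{(l)}$ and $k^{(l)}_{\min} = k^{(l-1)}_{\min} - 1$; otherwise $c^{(l)} = a^{(l)}$
and $k^{(l)}_{\min} = k^{(l-1)}_{\min}$,

(ii) $P^{f,l}_{j+1/2}(x) = P^{f,l-1}_{j+1/2}(x) + c^{(l)} \prod_{i=k^{(l-1)}_{\min}}^{k^{(l-1)}_{\min}+l-1} (x - x_i)$.

In the above procedure $f[\cdot, \cdots, \cdot]$ are the standard Newton divided differences,
inductively defined as

$$f[x_1, x_2, \ldots, x_{k+1}] = \frac{f[x_2, \ldots, x_{k+1}] - f[x_1, \ldots, x_k]}{x_{k+1} - x_1}$$

with $f[x_1] = f(x_1)$.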

The ENO Interpolation Algorithm starts with a first degree polynomial $P^{f,1}_{j+1/2}(x)$
interpolating the function $f(x)$ at the two grid points $x_j$ and $x_{j+1}$. If we stop
here, we would obtain the first order monotone scheme. When higher order is
desired, we will in each step add just one point to the existing stencil, chosen
from the two immediate neighbors by the size of the two relevant divided
differences, which measure the local smoothness of the function $f(x)$.
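
The stencil selection described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function names (`divided_differences`, `eno_interpolant`, `evaluate`) are hypothetical, the nodes are stored in plain lists, and the interval index `j` is assumed to lie far enough from both ends of the data that the stencil can grow in either direction.

```python
def divided_differences(x, f):
    """Return dd with dd[l][i] = f[x_i, ..., x_{i+l}] (Newton divided differences)."""
    n = len(x)
    dd = [list(f)]
    for l in range(1, n):
        prev = dd[-1]
        dd.append([(prev[i + 1] - prev[i]) / (x[i + l] - x[i])
                   for i in range(n - l)])
    return dd


def eno_interpolant(x, f, j, r):
    """Degree-r ENO polynomial for the interval [x_j, x_{j+1}].

    Returns (kmin, terms), where kmin is the left-most stencil index and
    terms is a list of (coefficient, node_indices) pairs in Newton form:
        P(x) = sum over terms of coefficient * prod_i (x - x_i).
    """
    dd = divided_differences(x, f)
    kmin = j
    # Step (1): linear interpolant through x_j and x_{j+1}.
    terms = [(dd[0][j], []), (dd[1][j], [j])]
    for l in range(2, r + 1):
        if kmin - 1 < 0 or kmin + l >= len(x):
            raise ValueError("stencil would leave the data range")
        a = dd[l][kmin]        # candidate: add the right neighbor
        b = dd[l][kmin - 1]    # candidate: add the left neighbor
        nodes = list(range(kmin, kmin + l))   # current stencil without its last point
        if abs(a) >= abs(b):
            c, kmin = b, kmin - 1             # left extension is smoother
        else:
            c = a                             # right extension is smoother
        terms.append((c, nodes))
    return kmin, terms


def evaluate(x, terms, xi):
    """Evaluate the Newton-form polynomial at the point xi."""
    total = 0.0
    for c, nodes in terms:
        prod = 1.0
        for i in nodes:
            prod *= xi - x[i]
        total += c * prod
    return total
```

For data sampled from a step function, the selected stencil moves away from the jump, so the interpolant on a cell next to the discontinuity stays free of overshoots. Differentiating the Newton form at $x_i$ would then give the one-sided derivative approximations used in the scheme.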
The approximations to the left and right $x$-derivatives of $\phi$ are then taken
as

$$u^\pm_{ij} = \frac{d}{dx} P^{\phi,r}_{i\pm 1/2,j}(x_i) \qquad (10.11)$$

where $P^{\phi,r}_{i\pm 1/2,j}(x)$ is obtained by the ENO Interpolation Algorithm in the
$x$-direction, with $y = y_j$ and $t$ both fixed. $v^\pm_{ij}$ are obtained in a similar fashion.
The resulting ODE (10.9) is then discretized by an $r$-th order TVD Runge-Kutta
time discretization in Sect. 9 to guarantee nonlinear stability. More
specifically, the high order Runge-Kutta method we use in Sect. 9 will maintain
TVD (total-variation-diminishing) or other stability properties, if these
properties are valid for the simple first order Euler forward time discretization
of the ODE (10.9). Notice that this is different from the usual linear
stability requirement for the ODE solver. We thus obtain both nonlinear
stability and high order accuracy in time. The second order ($r = 2$) and third
order ($r = 3$) methods we use which have this stability property are given by
(9.9) and (9.10), respectively.
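
As an illustration, the sketch below writes one step of the second and third order TVD Runge-Kutta methods in the widely used Shu-Osher form, as convex combinations of forward Euler steps. We assume here that these are the methods labelled (9.9) and (9.10); the function names and the list-based state are hypothetical, and `L` stands for any spatial operator returning the right-hand side of the ODE.

```python
def tvd_rk2(u, dt, L):
    """One step of the second order TVD Runge-Kutta method (Shu-Osher form)."""
    u1 = [ui + dt * li for ui, li in zip(u, L(u))]            # Euler step
    return [0.5 * ui + 0.5 * u1i + 0.5 * dt * li
            for ui, u1i, li in zip(u, u1, L(u1))]             # average with a second Euler step


def tvd_rk3(u, dt, L):
    """One step of the third order TVD Runge-Kutta method (Shu-Osher form)."""
    u1 = [ui + dt * li for ui, li in zip(u, L(u))]
    u2 = [0.75 * ui + 0.25 * u1i + 0.25 * dt * li
          for ui, u1i, li in zip(u, u1, L(u1))]
    return [ui / 3.0 + 2.0 / 3.0 * u2i + 2.0 / 3.0 * dt * li
            for ui, u2i, li in zip(u, u2, L(u2))]
```

Every stage is a convex combination of forward Euler updates with the same time step, which is why a stability property of the Euler step carries over to the full method.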

The time step restriction is taken as

$$\Delta t \left( \frac{1}{\Delta x} \max_{u,v} \left| \frac{\partial}{\partial u} H(u,v) \right| + \frac{1}{\Delta y} \max_{u,v} \left| \frac{\partial}{\partial v} H(u,v) \right| \right) \le 0.6$$

where the maximum is taken over the relevant ranges of $u$, $v$. Here 0.6 is
just a convenient number used in practice. According to our numerical
experience, this number should be chosen between 0.5 and 0.7.
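
A minimal sketch of how this restriction turns into a time step: given bounds on the two partial derivatives of $H$ over the relevant range, the largest admissible step follows directly. The function name and arguments are hypothetical.

```python
def hj_time_step(dx, dy, max_abs_Hu, max_abs_Hv, cfl=0.6):
    """Largest dt with dt * (max|H_u|/dx + max|H_v|/dy) <= cfl."""
    return cfl / (max_abs_Hu / dx + max_abs_Hv / dy)
```

For instance, for $H(u,v) = (u^2 + v^2)/2$ with $|u|, |v| \le 1$ on a mesh with $\Delta x = \Delta y = 0.1$, both maxima equal 1 and the formula gives $\Delta t = 0.03$.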

WENO schemes can be used in a similar fashion for Hamilton-Jacobi
equations [57]. We will not present the details here.

11  Applications  to  Compressible  Gas  Dynamics  I: 
Structured  Mesh  for  Polytropic  Gas 

One of the main application areas of ENO and WENO schemes is compressible
gas dynamics. In this section we describe the applications of ENO and
WENO schemes on structured meshes for polytropic gas dynamics.

In 3D, the Euler equations of a polytropic gas are written as

$$U_t + f(U)_x + g(U)_y + h(U)_z = 0 \qquad (11.1)$$

where

$$U = (\rho, \rho u, \rho v, \rho w, E),$$
$$f(U) = (\rho u, \rho u^2 + P, \rho u v, \rho u w, u(E + P)),$$
$$g(U) = (\rho v, \rho u v, \rho v^2 + P, \rho v w, v(E + P)),$$
$$h(U) = (\rho w, \rho u w, \rho v w, \rho w^2 + P, w(E + P)).$$

Here $\rho$ is the density, $(u,v,w)$ is the velocity, $E$ is the total energy, and $P$ is the
pressure, related to the total energy $E$ by

$$E = \frac{P}{\gamma - 1} + \frac{1}{2}\rho(u^2 + v^2 + w^2)$$

with $\gamma = 1.4$ for air.

In  two  space  dimensions,  there  is  one  fewer  equation  with  the  w  compo¬ 
nent  of  the  velocity  eliminated;  in  one  space  dimension,  there  are  two  fewer 
equations  with  the  v  and  w  components  of  the  velocity  eliminated. 
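
The relation between conserved variables, pressure and flux in (11.1) can be made concrete with a short sketch. The helper names below are hypothetical and the state is stored as a plain 5-tuple; only the $x$-direction flux $f(U)$ is shown, the other two being analogous.

```python
GAMMA = 1.4  # ratio of specific heats for air


def total_energy(rho, u, v, w, P, gamma=GAMMA):
    """E = P/(gamma-1) + rho*(u^2+v^2+w^2)/2."""
    return P / (gamma - 1.0) + 0.5 * rho * (u * u + v * v + w * w)


def pressure(U, gamma=GAMMA):
    """Recover P from the conserved vector U = (rho, rho*u, rho*v, rho*w, E)."""
    rho, mx, my, mz, E = U
    return (gamma - 1.0) * (E - 0.5 * (mx * mx + my * my + mz * mz) / rho)


def flux_x(U, gamma=GAMMA):
    """Flux f(U) in the x-direction."""
    rho, mx, my, mz, E = U
    u = mx / rho
    P = pressure(U, gamma)
    return (mx, mx * u + P, my * u, mz * u, u * (E + P))
```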

For the form of the Navier-Stokes equations, for the eigenvalues and
eigenvectors needed for the characteristic-wise ENO and WENO schemes, and for
those equations appearing in curvilinear coordinates, see, e.g. [91].


Example  11.1.  Shock  tube  problem.  This  is  a  standard  problem  for 
testing  codes  for  one  dimensional  shock  calculations.  However,  it  is  not  the 
best  test  case  for  high  order  methods,  as  the  solution  structure  is  relatively 
simple  (basically  piecewise  linear).  The  set-up  is  a  Riemann  type  initial  data: 


$$U(x, 0) = \begin{cases} U_L & \text{if } x < 0 \\ U_R & \text{if } x > 0 \end{cases}$$

The two standard test cases are Sod's problem [95]:

$$(\rho_L, q_L, P_L) = (1, 0, 1); \qquad (\rho_R, q_R, P_R) = (0.125, 0, 0.1) \qquad (11.2)$$

and Lax's problem [64]:

$$(\rho_L, q_L, P_L) = (0.445, 0.698, 3.528); \qquad (\rho_R, q_R, P_R) = (0.5, 0, 0.571) \qquad (11.3)$$
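
A small sketch of how the Riemann data (11.2) and (11.3) might be set up on a grid. The names `SOD`, `LAX` and `riemann_initial_data` are hypothetical; each state is given as (density, velocity, pressure) and converted to the conserved variables (density, momentum, total energy) of the one dimensional Euler equations.

```python
SOD = ((1.0, 0.0, 1.0), (0.125, 0.0, 0.1))        # (left, right) states of (11.2)
LAX = ((0.445, 0.698, 3.528), (0.5, 0.0, 0.571))  # (left, right) states of (11.3)


def to_conserved(state, gamma=1.4):
    """Convert (rho, q, P) to (rho, rho*q, E) for the 1D Euler equations."""
    rho, q, P = state
    return (rho, rho * q, P / (gamma - 1.0) + 0.5 * rho * q * q)


def riemann_initial_data(xs, problem, gamma=1.4):
    """Conserved variables at the points xs: left state for x < 0, right state otherwise."""
    left, right = problem
    UL, UR = to_conserved(left, gamma), to_conserved(right, gamma)
    return [UL if x < 0.0 else UR for x in xs]
```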


We show the results of the finite difference WENO (third order and fifth
order) schemes for the Lax problem in Fig. 11.1. Notice that "PS" in the
pictures means a way of treating the system that is cheaper than the local
characteristic decompositions (for details, see [55]). "A" stands for Yang's artificial
compression [105] applied to these cases [55].

We can see from Fig. 11.1 that WENO performs reasonably well for this
shock tube problem. The contact discontinuity is smeared more than the
shock, as expected. Artificial compression helps to sharpen contacts. For this
problem, which is not the most demanding, the less expensive "PS" version
of WENO works quite well.

ENO schemes on this test case perform similarly. We will not give the
pictures here. See [90].


Fig. 11.1. Shock tube, Lax problem, density. Top left: third order WENO; Top
right: fifth order WENO; Bottom left: fifth order WENO with a "cheaper"
characteristic decomposition; Bottom right: fifth order WENO with artificial compression.

Example 11.2. Shock entropy wave interactions. This problem is very
suitable for high order ENO and WENO schemes, because both shocks and
complicated smooth flow features co-exist. In this example, a moving shock
interacts with an entropy wave of small amplitude. On a domain [0, 5], the
initial condition is:

$$\rho = 3.85714; \quad u = 2.629369; \quad P = 10.33333$$

when $x < 0.5$, and

$$\rho = e^{-\epsilon \sin(kx)}; \quad u = 0; \quad P = 1$$

when $x > 0.5$, where $\epsilon$ and $k$ are the amplitude and wave number of the
entropy wave, respectively. The mean flow is a pure right moving Mach 3
shock. If $\epsilon$ is small compared to the shock strength, the shock will march
to the right at approximately the non-perturbed shock speed and generate
a sound wave which travels along with the flow behind the shock. At the
same time, the perturbing entropy wave, after "going through" the shock, is
compressed and amplified and travels approximately at the speed of $u + c$,
where $u$ and $c$ are the velocity and the speed of sound of the mean flow to the
left of the shock. The amplification factor for the entropy wave can be obtained
by linear analysis.
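
As a consistency check on this set-up, the post-shock state can be recovered from the standard Rankine-Hugoniot relations for a Mach 3 shock moving into the unperturbed gas ($\rho = 1$, $u = 0$, $P = 1$, $\gamma = 1.4$). The sketch below is illustrative only; the function names are hypothetical, and the defaults $\epsilon = 0.01$, $k = 13$ are the values quoted for the tests in this example.

```python
import math


def post_shock_state(mach, rho1=1.0, p1=1.0, gamma=1.4):
    """State behind a shock of the given Mach number moving into gas at rest."""
    c1 = math.sqrt(gamma * p1 / rho1)                 # sound speed ahead of the shock
    s = mach * c1                                     # shock speed
    rho2 = rho1 * (gamma + 1.0) * mach**2 / ((gamma - 1.0) * mach**2 + 2.0)
    p2 = p1 * (1.0 + 2.0 * gamma / (gamma + 1.0) * (mach**2 - 1.0))
    u2 = s * (1.0 - rho1 / rho2)                      # from conservation of mass
    return rho2, u2, p2


def shock_entropy_initial_condition(x, eps=0.01, k=13.0):
    """Primitive variables (rho, u, P) of the initial condition at the point x."""
    if x < 0.5:
        return post_shock_state(3.0)
    return math.exp(-eps * math.sin(k * x)), 0.0, 1.0
```

Evaluating `post_shock_state(3.0)` reproduces the left state given above to the quoted number of digits.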

Since the entropy wave here is set to be very weak relative to the shock,
any numerical oscillation might pollute the generated waves (e.g. the sound
waves) and the amplified entropy waves. In our tests, we take $\epsilon = 0.01$ and
$k = 13$. The amplitude of the amplified entropy waves predicted by the linear
analysis is 0.08690716 (shown in the following figures as horizontal solid lines).

In Fig. 11.2, we show the result (entropy) when 12 waves have passed
through the shock. It is clear that a lower order method (more dissipative)
damps the magnitude of the transmitted wave more seriously, especially as
the waves travel further away from the shock. We can see that,
while fifth order WENO with 800 points already resolves the passing waves
well, and with 1200 points resolves the waves excellently, a second order
TVD scheme (which is a good one among second order schemes) with 2000
points still shows excessive dissipation downstream. If we agree that fifth
order WENO with 800 points behaves similarly to second order TVD with
2000 points, then there is a saving of a factor of 2.5 in grid points. This
factor is per dimension, hence for a 3D time dependent problem the saving
in the number of space-time grid points will be a factor of $2.5^4 \approx 40$, a significant
saving even after factoring in the extra cost per grid point for the higher
order WENO method.
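
Written out, the estimate combines the per-dimension ratio of grid points over three space dimensions and time (assuming the time step scales with the mesh size):

$$\left(\frac{2000}{800}\right)^{4} = 2.5^{4} = 39.0625 \approx 40.$$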

ENO  schemes  behave  similarly  for  this  problem. 

There  is  a  two  dimensional  version  of  this  problem,  when  the  entropy 
wave  can  make  an  angle  with  the  shock.  The  simulation  results  again  show 
an  advantage  in  using  a  higher  order  method,  in  Fig.  11.3.  Several  curves 
are  clustered  in  Fig.  11.3  around  the  exact  solution,  belonging  to  various 
fourth  and  fifth  order  ENO  or  WENO  schemes.  The  circles  correspond  to  a 
second  order  TVD  scheme,  which  dissipates  the  amplitude  of  the  transmitted 
entropy  wave  much  more  rapidly. 
Fig. 11.2. 1D shock entropy wave interaction. Entropy. Top: fifth order WENO
with 800 points (WENO-LF-5/RK-4, N=800); middle: fifth order WENO with 1200
points (WENO-LF-5/RK-4, N=1200); bottom: second order TVD with 2000 points
(LAX-LIU/RK-2, N=2000).


Fig.  11.3.  2D  shock  entropy  wave  interaction.  Amplitude  of  amplified  entropy 
waves.  800  points  (about  20  points  per  entropy  wave  length). 
Example 11.3. Steady state calculations. This is important both in gas
dynamics and in other fields of application, such as semiconductor device
simulation. For ENO or TVD schemes, the residual does not settle down to
machine zero during the time evolution. It decays first and then hangs at
the level of the local truncation errors. Presumably this is due to the fact that
the numerical flux is not smooth enough (it is only Lipschitz continuous but
not $C^1$). Although this is not satisfactory, it does not seem to affect the final
solution (up to the truncation error level, which is how accurate the solution
will be anyway).

WENO schemes are much better at getting the residual to settle down to
machine zero, due to the smoothness of their fluxes.

In Fig. 11.4 we show the result of a one dimensional nozzle calculation.
The residual in this case settles down nicely to machine zero. Both fourth
and fifth order WENO results are shown.


Fig. 11.4. Density. Steady quasi-1D nozzle flow. 34 points. Left: fourth order
WENO; Right: fifth order WENO.


Example  11.4.  Forward  facing  step  problem.  This  is  a  standard  test 
problem  for  high  resolution  schemes  [104].  However,  second  order  methods 
usually  already  work  well.  High  order  methods  might  have  some  advantage 
in  resolving  the  slip  lines.  We  refer  the  readers  to  [21]  for  an  illustration  of 
such  advantages  of  high  order  schemes. 

The setup of the problem is the following: the wind tunnel is 1 length
unit wide and 3 length units long. The step is 0.2 length units high and is
located 0.6 length units from the left-hand end of the tunnel. The problem
is initialized by a right-going Mach 3 flow. Reflective boundary conditions
are applied along the walls of the tunnel and in-flow and out-flow boundary
conditions are applied at the entrance (left-hand end) and the exit
(right-hand end). For the treatment of the singularity at the corner of the step, we
adopt the same technique used in [104], which is based on the assumption of
a nearly steady flow in the region near the corner.

In  Fig.  11.5  we  present  the  results  of  fifth  order  WENO  and  fourth  order 
ENO  with  242  x  79  grid  points. 


Fig. 11.5. Flow past a forward facing step. Density: 242 x 79 grid points. Top: fifth
order WENO (WENO-LF-5); bottom: fourth order ENO (ENO-LF-4).


Example 11.5. Double Mach reflection. This is again a standard test
problem for high resolution schemes [104]. However, second order methods
usually already work well here too. High order methods have some advantage
in resolving the flow below the Mach stem. We again refer the readers to [21]
for an illustration of such advantages of high order schemes.

The computational domain for this problem is chosen to be [0,4] x [0,1],
although only part of it, [0,3] x [0,1], is shown [104]. The reflecting wall lies
at the bottom of the computational domain starting from $x = \frac{1}{6}$. Initially a
right-moving Mach 10 shock is positioned at $x = \frac{1}{6}$, $y = 0$ and makes a 60°
angle with the x-axis. For the bottom boundary, the exact post-shock condition
is imposed for the part from $x = 0$ to $x = \frac{1}{6}$ and a reflective boundary
condition is used for the rest. At the top boundary of our computational
domain, the flow values are set to describe the exact motion of the Mach 10
shock. See [104] for a detailed description of this problem.

In  Fig.  11.6  we  present  the  results  of  fifth  order  WENO  and  fourth  order 
ENO  with  480  x  119  grid  points. 

Fig. 11.6. Double Mach reflection. Density: 480 x 119 grid points. Top: fifth order
WENO; bottom: fourth order ENO.

In Fig. 11.7 we present the result of fifth order WENO with a more refined
mesh, 1920 x 479 grid points, and a "blow-up" portion of the picture near
the Mach stem. We can see the complicated structures being captured by the
scheme.

Fig. 11.7. Double Mach reflection. Density: 1920 x 479 grid points. Fifth order
WENO. Top: the whole region; bottom: the blow-up region near the Mach stem.

Example 11.6. 2D shock vortex interactions. High order methods have
some advantages in this case, as they resolve the vortex and the interaction
better.

The model problem we use describes the interaction between a stationary
shock and a vortex. The computational domain is taken to be [0, 2] x [0, 1]. A
stationary Mach 1.1 shock is positioned at $x = 0.5$ and normal to the x-axis.
Its left state is $(\rho, u, v, P) = (1, 1.1\sqrt{\gamma}, 0, 1)$. A small vortex is superposed on
the flow to the left of the shock and is centered at $(x_c, y_c) = (0.25, 0.5)$. We describe
the vortex as a perturbation to the velocity $(u, v)$, temperature ($T = \frac{P}{\rho}$) and
entropy ($S = \ln \frac{P}{\rho^\gamma}$) of the mean flow and denote it by the tilde values:

$$\tilde{u} = \epsilon \tau e^{\alpha(1 - \tau^2)} \sin\theta$$
$$\tilde{v} = -\epsilon \tau e^{\alpha(1 - \tau^2)} \cos\theta$$
$$\tilde{T} = -\frac{(\gamma - 1)\epsilon^2 e^{2\alpha(1 - \tau^2)}}{4\alpha\gamma}$$
$$\tilde{S} = 0$$

where $\tau = \frac{r}{r_c}$ and $r = \sqrt{(x - x_c)^2 + (y - y_c)^2}$. Here $\epsilon$ indicates the strength
of the vortex, $\alpha$ controls the decay rate of the vortex and $r_c$ is the critical
radius for which the vortex has the maximum strength. In our tests, we choose
$\epsilon = 0.3$, $r_c = 0.05$ and $\alpha = 0.204$. The above defined vortex is a steady state
solution to the 2D Euler equation.

We  use  a  grid  of  251  x  100  which  is  uniform  in  y  but  refined  in  x  around  the 
shock.  The  upper  and  lower  boundaries  are  intentionally  set  to  be  reflective. 
The  results  (pressure  contours)  are  shown  in  Fig.  11.8  for  a  fifth  order  WENO 
with  the  cheap  “PS”  way  of  treating  characteristic  decomposition  for  the 
system. 

In  [30],  interaction  of  a  shock  with  a  longitudinal  vortex  is  also  investi¬ 
gated  by  the  ENO  method. 

Fig. 11.8. 2D shock vortex interaction. Pressure. Fifth order WENO-LF-5-PS. 30
contours. Top left: t=0.05, Top right: t=0.20, Bottom: t=0.35.

Example 11.7. 2D bow shock. How do the finite difference versions of
ENO and WENO handle non-rectangular domains? As we mentioned before,
as long as the domain can be smoothly transformed to a rectangle, the schemes
can be handily applied.

We consider, as an example, the problem of a supersonic flow past a
cylinder. In the physical space, a cylinder of unit radius is positioned at the
origin on the $x$-$y$ plane. The computational domain is chosen to be [0, 1] x [0, 1]
on the $\xi$-$\eta$ plane. The mapping between the computational domain and the
physical domain is:

$$x = (R_x - (R_x - 1)\xi) \cos(\theta(2\eta - 1)) \qquad (11.4)$$
$$y = (R_y - (R_y - 1)\xi) \sin(\theta(2\eta - 1)) \qquad (11.5)$$

where we take $R_x = 3$, $R_y = 6$ and $\theta = \frac{5\pi}{12}$. Fifth order WENO and a uniform
mesh of 60 x 80 in the computational domain are used.

The problem is initialized by a Mach 3 shock moving toward the cylinder
from the left. A reflective boundary condition is imposed at the surface of the
cylinder, i.e. $\xi = 1$, an inflow boundary condition is applied at $\xi = 0$ and an outflow
boundary condition is applied at $\eta = 0, 1$.
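
The mapping (11.4)-(11.5) is easy to evaluate directly, as the sketch below illustrates. The function name is hypothetical, and the default value of `theta` is an assumption made for this illustration; any other angle can be passed in.

```python
import math


def cylinder_mapping(xi, eta, Rx=3.0, Ry=6.0, theta=5.0 * math.pi / 12.0):
    """Map a point (xi, eta) of the unit square to physical coordinates (x, y).

    theta defaults to an assumed value; pass the angle of your own set-up.
    """
    angle = theta * (2.0 * eta - 1.0)
    x = (Rx - (Rx - 1.0) * xi) * math.cos(angle)
    y = (Ry - (Ry - 1.0) * xi) * math.sin(angle)
    return x, y


def physical_grid(n_xi=60, n_eta=80):
    """Physical coordinates of a uniform (n_xi x n_eta)-point grid in (xi, eta)."""
    return [[cylinder_mapping(i / (n_xi - 1), j / (n_eta - 1))
             for j in range(n_eta)] for i in range(n_xi)]
```

For $\xi = 1$ the factors in front of the cosine and sine both reduce to 1, so that side of the computational square is mapped onto the unit circle, i.e. the cylinder surface.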

We  present  an  illustration  of  the  mesh  in  the  physical  space  (drawing 
every  other  grid  line),  and  the  pressure  contour,  in  Fig.  11.9.  Similar  results 
are  obtained  by  the  ENO  schemes  but  are  not  shown  here. 


Fig.  11.9.  Flow  past  a  cylinder.  Left:  the  physical  grid,  Right:  pressure.  WENO- 
LF-5.  20  contours. 


Example 11.8. Vortex evolution. Finally, we use the following problem
to illustrate more clearly the power of high order methods. Consider the
following idealized problem for the Euler equations in 2D: the mean flow is
$\rho = 1$, $P = 1$, and $(u, v) = (1, 1)$ (diagonal flow). We add, to this mean flow,
an isentropic vortex (perturbations in $(u, v)$ and the temperature $T = \frac{P}{\rho}$, no
perturbation in the entropy $S = \frac{P}{\rho^\gamma}$):

$$(\delta u, \delta v) = \frac{\epsilon}{2\pi} e^{0.5(1 - r^2)} (-\bar{y}, \bar{x}),$$

$$\delta T = -\frac{(\gamma - 1)\epsilon^2}{8\gamma\pi^2} e^{1 - r^2},$$

$$\delta S = 0,$$

where $(\bar{x}, \bar{y}) = (x - 5, y - 5)$, $r^2 = \bar{x}^2 + \bar{y}^2$, and the vortex strength $\epsilon = 5$.

Since  the  mean  flow  is  in  the  diagonal  direction,  the  vortex  movement  is 
not  aligned  with  the  mesh  direction. 

The computational domain is taken as [0,10] x [0,10], extended periodically
in both directions. This allows us to perform long time simulations without
having to deal with a large domain. As we will see, the advantages of the high
order methods are more obvious for long time simulations.

It  is  clear  that  the  exact  solution  of  the  Euler  equation  with  the  above 
initial  and  boundary  conditions  is  just  the  passive  convection  of  the  vortex 
with  the  mean  velocity. 
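
A sketch of how the vortex initial condition of this example might be evaluated pointwise. The function name is hypothetical. Since the mean flow has $T = 1$ and $S = 1$ and the entropy is not perturbed, the density and pressure follow from $T = P/\rho$ and $P/\rho^\gamma = 1$, which gives $\rho = T^{1/(\gamma - 1)}$ and $P = \rho^\gamma$.

```python
import math


def isentropic_vortex(x, y, eps=5.0, gamma=1.4, xc=5.0, yc=5.0):
    """Primitive variables (rho, u, v, P) of the mean flow plus vortex at (x, y)."""
    xb, yb = x - xc, y - yc
    r2 = xb * xb + yb * yb
    strength = eps / (2.0 * math.pi) * math.exp(0.5 * (1.0 - r2))
    du, dv = -strength * yb, strength * xb
    dT = -(gamma - 1.0) * eps**2 / (8.0 * gamma * math.pi**2) * math.exp(1.0 - r2)
    T = 1.0 + dT                         # mean temperature is P/rho = 1
    rho = T ** (1.0 / (gamma - 1.0))     # from T = P/rho and P = rho^gamma
    P = rho ** gamma                     # entropy P/rho^gamma stays equal to 1
    return rho, 1.0 + du, 1.0 + dv, P
```

Because the exact solution is this profile transported with velocity (1, 1) on the periodic domain, the same function evaluated at shifted coordinates can serve as the reference solution at later times.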

A grid of $80^2$ points is used. The simulation is performed until t = 100
(10 periods in time). As can be seen from Fig. 11.10, fifth order WENO has
a much better resolution than a second order TVD scheme, especially for the
larger time t = 100.


12 Applications to Compressible Gas Dynamics II:
Unstructured Mesh for Polytropic Gas


In  this  section  we  describe  the  application  of  the  third  and  fourth  order 
WENO  schemes  in  Sect.  6.2  and  Sect.  7.1,  [49,50]  to  the  two  dimensional 
Euler  equations  of  a  polytropic  gas  in  general  triangulations.  The  equations 
are  given  by  (11.1)  without  the  third  dimension. 

As  was  mentioned  in  Sect.  7.4,  there  are  two  ways  to  extend  the  scalar 
schemes  to  systems.  One  is  to  do  so  component  by  component.  This  is  easy 
to  implement  and  cost  effective,  and  it  seems  to  work  well  for  the  third  order 
scheme. We will use component-wise methods for all numerical examples with
the  third  order  WENO  scheme  in  this  section.  Another  extension  method  is 
by  the  characteristic  decomposition.  We  will  give  a  brief  description  in  the 
following. 

Let us take one side of the triangle which has the outward unit normal
$(n_x, n_y)$. Let $A$ be some average Jacobian at one quadrature point,

$$A = n_x \frac{\partial f}{\partial u} + n_y \frac{\partial g}{\partial u}. \qquad (12.1)$$

For Euler systems, Roe's mean matrix [82] is used. Denote by $R$ the matrix
of right eigenvectors and $L$ the matrix of left eigenvectors of $A$. Then the
scalar triangular WENO scheme can be applied to each of the characteristic
fields, i.e. to each component of the vector $v = Lu$. With the reconstructed
point values $v$, we define our reconstructed point values $u$ by $u = Rv$.

Fig. 11.10. Vortex evolution. Cut at x = 5. Solid: exact solution; circles: computed
solution. Top: t = 50 (after 5 time periods); Bottom: t = 100 (after 10 time periods).
Left: second order TVD scheme; Right: fifth order WENO scheme.
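
The projection $v = Lu$, scalar reconstruction, and back-projection $u = Rv$ can be sketched as follows. To keep the example short it uses the well known eigenvectors of the one dimensional Euler Jacobian at a single given state, rather than the Roe average of the normal Jacobian (12.1) on a triangle side; the function names are hypothetical, and `scalar_reconstruct` stands for any scalar reconstruction procedure acting on a list of stencil values.

```python
import math


def euler_1d_eigenvectors(rho, u, p, gamma=1.4):
    """Right (R) and left (L) eigenvector matrices of the 1D Euler Jacobian.

    Columns of R correspond to the eigenvalues u - c, u, u + c.
    """
    c = math.sqrt(gamma * p / rho)
    H = c * c / (gamma - 1.0) + 0.5 * u * u          # total enthalpy
    R = [[1.0, 1.0, 1.0],
         [u - c, u, u + c],
         [H - u * c, 0.5 * u * u, H + u * c]]
    b1 = (gamma - 1.0) / (c * c)
    b2 = 0.5 * b1 * u * u
    L = [[0.5 * (b2 + u / c), -0.5 * (b1 * u + 1.0 / c), 0.5 * b1],
         [1.0 - b2, b1 * u, -b1],
         [0.5 * (b2 - u / c), -0.5 * (b1 * u - 1.0 / c), 0.5 * b1]]
    return R, L


def matvec(M, x):
    """Matrix-vector product for small dense matrices stored as nested lists."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def characteristic_reconstruct(states, R, L, scalar_reconstruct):
    """Reconstruct field by field in characteristic variables, then map back."""
    v = [matvec(L, s) for s in states]                       # v = L u on the stencil
    v_rec = [scalar_reconstruct([vk[m] for vk in v])         # one scalar problem per field
             for m in range(len(R))]
    return matvec(R, v_rec)                                  # u = R v
```

With a linear `scalar_reconstruct` the result coincides with the component-wise one; the two approaches differ only when the scalar procedure is nonlinear, as ENO and WENO are.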

Example 12.1. Vortex evolution. This is the same test case as in Example
11.8.

The reconstruction procedure is applied to each component of the solution
$U$. We first compute the solution to t = 2.0 for the accuracy test. The meshes
are the same as those used in the accuracy tests in Sect. 7.1 for the scalar
linear and Burgers equations, suitably scaled for the new spatial domain. The
accuracy results for the linear schemes are shown in Table 12.1 for the uniform
meshes and Table 12.2 for the non-uniform meshes. The errors presented are
those of the cell averages of $\rho$. The accuracy results for the WENO schemes
are shown in Table 12.3 for the uniform meshes and Table 12.4 for the
non-uniform meshes.
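
The "order" columns in these tables are the observed convergence rates between consecutive meshes. Assuming that the mesh size is halved from one row to the next, the rate is $\log_2$ of the ratio of successive errors. The sketch below (with a hypothetical function name) recomputes the rates from the $L^1$ errors of the third order linear scheme on uniform meshes reported in Table 12.1.

```python
import math


def observed_orders(errors, refinement_ratio=2.0):
    """Observed convergence rates between consecutive meshes."""
    return [math.log(coarse / fine) / math.log(refinement_ratio)
            for coarse, fine in zip(errors, errors[1:])]


# L^1 errors of the third order linear scheme, uniform meshes (Table 12.1).
l1_errors = [1.65e-02, 6.31e-03, 1.31e-03, 2.21e-04, 2.98e-05, 3.77e-06]
orders = [round(p, 2) for p in observed_orders(l1_errors)]
```

The values in `orders` agree with the tabulated rates 1.39, 2.27, 2.57, 2.89 and 2.98, approaching the design order 3 as the mesh is refined.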


Table 12.1. Accuracy for 2D Euler equation of smooth vortex evolution, uniform
meshes, linear schemes.

          P^1 (3rd order)                                   P^2 (4th order)
h         L^1 error    order        L^∞ error    order        L^1 error    order        L^∞ error    order
1         1.65E-02     -            2.60E-01     -            5.26E-03     -            7.89E-02     -
1/2       6.31E-03     1.39         1.21E-01     1.10         7.36E-04     2.84         1.62E-02     2.28
1/4       1.31E-03     2.27         2.53E-02     2.26         5.40E-05     3.77         1.03E-03     3.98
1/8       2.21E-04     2.57         4.66E-03     2.44         2.32E-06     4.54         5.36E-05     4.26
1/16      2.98E-05     2.89         6.44E-04     2.86         1.10E-07     [illegible]  [illegible]  4.43
1/32      3.77E-06     2.98         8.23E-05     2.97         6.37E-09     4.11         1.25E-07     4.31

We then fix the mesh at h = [illegible] (uniform) and compute the long time
evolution of the vortex. Fig. 12.1 is the result by the third order scheme at
t = 0 and after 1, 5 and 10 time periods, and Fig. 12.2 is the result by the
fourth order scheme. We show the line cut through the center of the vortex
for the density $\rho$. It is easy to see the difference between the third and fourth
order schemes. The fourth order scheme gives almost no dissipation even after
10 periods, while the dissipation is quite noticeable for the long time results
of the third order scheme.
Table 12.2. Accuracy for 2D Euler equation of smooth vortex evolution,
non-uniform meshes, linear schemes.

          P^1 (3rd order)                                   P^2 (4th order)
h         L^1 error    order        L^∞ error    order        L^1 error    order        L^∞ error    order
h0/2      1.81E-02     -            2.98E-01     -            7.00E-03     -            8.16E-02     -
h0/4      7.74E-03     1.28         1.44E-01     1.05         1.18E-03     2.57         1.61E-02     2.34
h0/8      1.67E-03     2.21         2.47E-02     2.54         8.17E-05     3.85         1.31E-03     3.62
h0/16     [illegible]  2.55         4.79E-03     2.37         4.70E-06     4.12         1.10E-04     3.57
h0/32     3.94E-05     2.86         7.95E-04     2.59         2.68E-07     4.13         7.73E-06     3.83
h0/64     5.07E-06     2.96         1.25E-04     2.67         1.56E-08     4.10         5.99E-07     [illegible]

Table 12.3. Accuracy for 2D Euler equation of smooth vortex evolution, uniform
meshes, WENO schemes.

          P^1 (3rd order)                                   P^2 (4th order)
h         L^1 error    order        L^∞ error    order        L^1 error    order        L^∞ error    order
1         1.87E-02     -            2.95E-01     -            1.30E-02     -            2.05E-01     -
1/2       1.01E-02     [illegible]  [illegible]  [illegible]  2.50E-03     2.38         4.45E-02     2.49
1/4       2.78E-03     1.86         6.37E-02     1.71         1.79E-04     [illegible]  3.29E-03     3.76
1/8       6.47E-04     2.10         3.05E-02     1.06         6.92E-06     4.69         1.96E-04     4.07
1/16      8.74E-05     2.89         8.14E-03     1.91         2.03E-07     5.09         4.95E-06     5.31
1/32      7.10E-06     3.62         5.66E-04     [illegible]  7.83E-09     4.70         1.96E-07     4.66

Table 12.4. Accuracy for 2D Euler equation of smooth vortex evolution,
non-uniform meshes, WENO schemes.

          P^1 (3rd order)                                   P^2 (4th order)
h         L^1 error    order        L^∞ error    order        L^1 error    order        L^∞ error    order
h0/2      2.12E-02     -            3.33E-01     -            1.84E-02     -            2.14E-01     -
h0/4      1.28E-02     0.73         2.27E-01     [illegible]  2.80E-03     2.69         3.43E-02     2.64
h0/8      3.84E-03     1.74         6.85E-02     1.73         2.12E-04     3.72         6.57E-03     2.38
h0/16     8.32E-04     2.21         3.02E-02     1.18         1.09E-05     4.28         5.91E-04     3.48
h0/32     1.26E-04     2.72         5.64E-03     2.42         3.76E-07     4.86         1.97E-05     4.91
h0/64     1.16E-05     3.44         6.19E-04     3.19         1.66E-08     4.50         6.78E-07     4.86
Fig. 12.1. 2D vortex evolution: third order schemes. Left: linear scheme; Right:
WENO scheme.


Fig.  12.2.  2D  vortex  evolution:  fourth  order  schemes.  Left:  linear  scheme;  Right: 
WENO  scheme. 
Example 12.2. Shock tube problem. This is the same test case as in
Example 11.1, except that we compute the problem in two dimensions. We
consider the solution of the Euler equations in a domain of [-1, 1] x [0, 0.2]
with a triangulation of 101 vertices in the x-direction and 11 vertices in
the y-direction. The velocity in the y-direction is zero, and a periodic boundary
condition is used in the y-direction. A portion of the mesh is shown in
Fig. 12.3. The pictures shown below are obtained by extracting the data along
the central cut line for 101 equally spaced points.


Fig.  12.3.  A  portion  of  the  mesh  for  the  Riemann  problems. 


The  first  test  case  is  Sod’s  problem  (11.2).  Density  at  t  =  0.40  is  shown 
in  Fig.  12.4,  left. 

The  second  test  case  is  the  Riemann  problem  proposed  by  Lax  (11.3). 
Density  at  t  =  0.26  is  shown  in  Fig.  12.4,  right. 

We  can  observe  a  better  resolution  of  the  fourth  order  scheme  over  the 
third  order  one,  and  also  a  less  oscillatory  result  from  the  characteristic  ver¬ 
sion  of  the  fourth  order  scheme  over  the  component  version. 

Example 12.3. Forward facing step problem. This is the same test case
as in Example 11.4. However, for the corner singularity, we do not adopt
the technique used in [104] and in Example 11.4, which is based on the
assumption of a nearly steady flow in the region near the corner. Instead, we do not
modify our method near the corner and adopt the same technique as
the one used in [21], namely refining the mesh near the corner and using the
same scheme in the whole domain.

We use the third order scheme for this problem. Four meshes have been
used, see Fig. 12.5. For the first mesh, the triangle size away from the corner
is roughly equal to a rectangular element case of Δx = Δy = [illegible], while it is
one-quarter of that near the corner. For the second mesh, the triangle size
away from the corner is the same as in the first mesh, but it is one-eighth of
that near the corner. The third mesh has a triangle size of Δx = Δy = [illegible]
away from the corner, and it is one-quarter of that near the corner. The last
mesh has a triangle size of Δx = Δy = [illegible] away from the corner, and it
is one-half of that near the corner. Fig. 12.6 is the contour picture for the
density at time t = 4.0. It is clear that with more triangles near the corner
the artifacts from the singularity decrease significantly.


Fig. 12.4. Riemann problems of Euler equations. Density. Left: Sod's problem;
Right: Lax's problem. Top: third order componentwise WENO; Middle: fourth order
componentwise WENO; Bottom: fourth order characteristicwise WENO.

Fig. 12.5. Triangulations for the forward step problem: part near the corner.

Fig. 12.6. Forward step problem: 30 contours from 0.32 to 6.15. Panels: density,
3rd order, triangulation 1; density, 3rd order, triangulation 2.
Example 12.4. Double Mach reflection. This is the same test case as in
Example 11.5.

We test both the third and the fourth order schemes. Four triangle sizes
are used; they are roughly equal to rectangular element cases of Δx = Δy =
[illegible], Δx = Δy = [illegible], Δx = Δy = [illegible] and Δx = Δy = [illegible], respectively.

For the third order scheme, we use both a uniform triangular mesh (equilateral
triangles) and a locally refined triangular mesh (the refined region has the
above triangle sizes; Fig. 12.7 shows the region [0, 2] x [0, 1] of such a mesh
of Δx = Δy = [illegible] locally). For the fourth order scheme, we use a uniform triangular
mesh only. For the cases of Δx = Δy = [illegible] and Δx = Δy = [illegible], we present
both the picture of the whole region ([0,3] x [0, 1]) and a blow-up region around
the double Mach stems. All pictures are the density contours with 30 equally
spaced contour lines from 1.5 to 21.5. We can clearly see that the fourth order
scheme captures the complicated flow structure under the triple Mach stem
much better than the third order scheme. We refer to [21] for similar results
obtained with discontinuous Galerkin methods.


Fig.  12.7.  Triangulation  for  the  double  Mach  reflection. 


13  Applications  to  Compressible  Gas  Dynamics  III: 
Structured  Mesh  for  Real  Gas 


In  this  section  we  describe  the  application  of  the  fifth  order  WENO  scheme 
on  a  structured  mesh  in  Sect.  4.4  and  Sect.  7.4  to  solve  the  Euler  equations 
of  a  real  gas  [74]. 


Fig. 12.8. Double Mach reflection: h = …, t = 0.2.

Fig. 12.11. Double Mach reflection: h = …, t = 0.2 (blow-up).

[Figure panels: third order, h = 1/400; fourth order (componentwise), h = 1/400.]

Fig. 12.12. Double Mach reflection: h = 1/400, t = 0.2.
We consider the Euler equations for a real compressible inviscid fluid,

$$
\begin{aligned}
&\partial_t \rho + \operatorname{div}(\rho u) = 0, \qquad t > 0,\ x \in \mathbb{R}^d, \\
&\partial_t \rho u + \operatorname{div}(\rho u \otimes u + p) = 0, \\
&\partial_t E + \operatorname{div}((E + p)u) = 0, \\
&E = \tfrac{1}{2}\rho |u|^2 + \rho\varepsilon,
\end{aligned}
\tag{13.1}
$$

where the quantities $\rho$, $u$, $p$, $E$ and $\varepsilon$ represent the density,
velocity, pressure, total energy and specific internal energy, respectively. In
addition, there is an equation of state (EOS) of the form
$p = p(\rho, \varepsilon)$ associated with a strictly convex entropy
$\rho s(\rho, \varepsilon)$ which satisfies the following entropy inequality

$$
\partial_t \rho s + \operatorname{div}(\rho s u) \le 0. \tag{13.2}
$$

The pressure law is furthermore assumed to satisfy

$$
p_{,\varepsilon}(\rho, \varepsilon) > 0, \tag{13.3}
$$

$p(\rho, 0) = 0$ and $p(\rho, \infty) = \infty$.

In the literature, research has been done to extend classical schemes designed
for a perfect gas to real gases. Colella and Glaz [22] extended the numerical
procedure for obtaining the exact Riemann solution to the real-gas case;
Grossman and Walters [38] and Liou, van Leer and Shuen [68] extended the methods
of flux-vector splitting and flux-difference splitting; Montagne, Yee and
Vinokur [73] developed second-order explicit shock-capturing schemes for real
gases; Glaister [34] presented an extension of the approximate linearized
Riemann solver with different averaged matrices; and Loh and Liou [71] used a
generalization of their Lagrangian approach (originally proposed for a perfect
gas) to obtain the real gas Riemann solution.

Most of the previously proposed methods require a computation of the pressure
law and its derivatives, or a Riemann solver. This is not only costly but also
problematic when there is no analytical expression for the pressure law (for
example, if only tabulated values are available).

Recently Coquel and Perthame [23] have introduced an energy relaxation theory
for the Euler equations of a real gas. The main idea is to introduce a
relaxation of the nonlinear pressure law by considering an energy decomposition
of the form $\varepsilon = \varepsilon_1 + \varepsilon_2$. The internal energy
$\varepsilon_1$ is associated with a simpler pressure law $p_1$ (which is taken
as the $\gamma$-law in this section), while $\varepsilon_2$ stands for the
nonlinear perturbation and is simply convected by the flow. These two energies
are also subject to a relaxation process, and in the limit of an infinite
relaxation rate one recovers the initial pressure law $p$.

From this general framework, Coquel and Perthame have also deduced the
extension to general pressure laws of classical schemes for polytropic gases,
which uses only a single call to the pressure law per grid point and time
step. No derivatives of the pressure law and no Riemann solvers need to be
computed. Another advantage of their approach is that its implementation
does not depend on the particular expression of the equation of state. For
the first order Godunov scheme, they have shown that this extension satisfies
stability, entropy and accuracy conditions. Numerical examples using first
order schemes have been provided by A. In [51].

The  aim  of  this  section  is  to  study  the  implementation  of  this  relaxation 
method  with  high  order  WENO  schemes  [55]  for  real  gases.  One  and  two 
dimensional  numerical  examples  will  be  given. 

In Sect. 13.1 we provide the general framework of the energy relaxation
theory of [23]. We then give the details of the construction of the relaxed
WENO schemes for general gases. In Sect. 13.2 numerical examples are given.
We start with a description of the different equations of state used in this
section, followed by one dimensional shock tube test problems. Two dimensional
test cases are then presented: a smooth vortex, to test the accuracy of the
schemes, and the double Mach reflection problem.

13.1  Implementation  of  the  Energy  Relaxation  Method  with 
WENO 

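The principle of the energy relaxation theory developed by Coquel and Perthame
[23] is to find a pressure law $p_1(\rho^\lambda, \varepsilon_1^\lambda)$
(simpler than $p$, typically a polytropic law) and an internal energy
$\phi(\rho^\lambda, \varepsilon_1^\lambda)$ so that the system (13.1) and the
entropy inequality (13.2) can be recovered, in the limit of an infinite
relaxation rate $\lambda$ (called the equilibrium limit), from the following
system (called the relaxation system):

$$
\begin{aligned}
&\partial_t \rho^\lambda + \operatorname{div}(\rho^\lambda u^\lambda) = 0, \qquad t > 0,\ x \in \mathbb{R}^d, \\
&\partial_t \rho^\lambda u^\lambda + \operatorname{div}(\rho^\lambda u^\lambda \otimes u^\lambda + p_1^\lambda) = 0, \\
&\partial_t E_1^\lambda + \operatorname{div}((E_1^\lambda + p_1^\lambda) u^\lambda) = \lambda \rho^\lambda \left(\varepsilon_2^\lambda - \phi(\rho^\lambda, \varepsilon_1^\lambda)\right), \\
&\partial_t \rho^\lambda \varepsilon_2^\lambda + \operatorname{div}(\rho^\lambda u^\lambda \varepsilon_2^\lambda) = -\lambda \rho^\lambda \left(\varepsilon_2^\lambda - \phi(\rho^\lambda, \varepsilon_1^\lambda)\right), \\
&E_1^\lambda = \tfrac{1}{2} \rho^\lambda |u^\lambda|^2 + \rho^\lambda \varepsilon_1^\lambda,
\end{aligned}
\tag{13.4}
$$

where $p_1^\lambda = p_1(\rho^\lambda, \varepsilon_1^\lambda) = (\gamma_1 - 1)\rho^\lambda \varepsilon_1^\lambda$
with $\gamma_1$ a given constant greater than 1.
One can prove [23] that the relaxation system (13.4) can be supplemented
by entropy inequalities of the form

$$
\partial_t \rho^\lambda \Sigma + \operatorname{div}(\rho^\lambda \Sigma u^\lambda) \le \mathrm{RED}^\lambda
:= -\lambda \rho^\lambda \left(\Sigma_{,s_1} s_{1,\varepsilon_1} - \Sigma_{,\varepsilon_2}\right)\left(\varepsilon_2^\lambda - \phi(\rho^\lambda, \varepsilon_1^\lambda)\right),
$$

where $s_1(\rho, \varepsilon_1) = \rho^{\gamma_1 - 1}/\varepsilon_1$ and the
specific entropy $\Sigma$ denotes an arbitrary function in $C^1(\mathbb{R}_+^2)$
such that $\rho\Sigma$ is convex in $(\rho, \rho\varepsilon_1, \rho\varepsilon_2)$
and that can be written in the form
$\Sigma = \Sigma(s_1(\rho, \varepsilon_1), \varepsilon_2)$. $\mathrm{RED}^\lambda$
represents the Rate of Entropy Dissipation.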

Formally, the original Euler system (13.1) is recovered as
$\lambda \to +\infty$ with

$$
\varepsilon = \varepsilon_1 + \varepsilon_2 = \varepsilon_1 + \phi(\rho, \varepsilon_1), \tag{13.5}
$$

provided that the following condition (called the consistency condition) holds:

$$
p(\rho, \varepsilon_1 + \phi(\rho, \varepsilon_1)) = p_1(\rho, \varepsilon_1) = (\gamma_1 - 1)\rho\varepsilon_1. \tag{13.6}
$$

This last condition can be fulfilled for any given choice of $\gamma_1 > 1$.
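
As a worked illustration (not taken from the source), suppose the original
pressure law is itself a $\gamma$-law, $p = (\gamma - 1)\rho\varepsilon$. Under
that assumption the consistency condition can be solved for $\phi$ in closed
form:

```latex
\begin{aligned}
(\gamma - 1)\,\rho\,\bigl(\varepsilon_1 + \phi(\rho,\varepsilon_1)\bigr)
  &= (\gamma_1 - 1)\,\rho\,\varepsilon_1 \\
\Longrightarrow\qquad
\phi(\rho,\varepsilon_1)
  &= \frac{\gamma_1 - \gamma}{\gamma - 1}\,\varepsilon_1 .
\end{aligned}
```

In this special case $\phi$ is independent of $\rho$, and the convected energy
$\varepsilon_2 = \phi$ is non-negative exactly when $\gamma_1 \ge \gamma$.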

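But in addition to the conservative system (13.1), one also wants to recover
in the limit the entropy inequality (13.2). The following result, due to
Coquel and Perthame [23], gives this last condition through a characterization
of the admissible $\gamma_1$.

Theorem 13.1. Assume that $\gamma_1$ satisfies

$$
\begin{aligned}
\gamma_1 &> \sup_{\rho,\varepsilon} \Gamma(\rho,\varepsilon), \qquad
\Gamma(\rho,\varepsilon) = 1 + \frac{p_{,\varepsilon}}{\rho}, \\
\gamma_1 &> \sup_{\rho,\varepsilon} \gamma(\rho,\varepsilon), \qquad
\gamma(\rho,\varepsilon) = \frac{\rho}{p}\,p_{,\rho} + \frac{p_{,\varepsilon}}{\rho}.
\end{aligned}
\tag{13.7}
$$

Provided that $\gamma_1$ is finite, we then have

(i) there exists a (unique) specific entropy $\Sigma(s_1, \varepsilon_2)$ such
that at equilibrium ($\varepsilon = \varepsilon_1 + \phi(\rho, \varepsilon_1)$)

$$
s(\rho, \varepsilon) = \Sigma(s_1(\rho, \varepsilon_1), \phi(\rho, \varepsilon_1)),
$$

(ii) this entropy is uniformly compatible with the relaxation procedure, i.e.:

$$
\mathrm{RED}^\lambda \le 0 \quad \text{for all } \lambda > 0.
$$

□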
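
The following is a minimal sketch of how one might estimate a lower bound for
$\gamma_1$ from condition (13.7) when only pressure values are available. It
assumes the reading $\Gamma = 1 + p_{,\varepsilon}/\rho$ and
$\gamma = \rho p_{,\rho}/p + p_{,\varepsilon}/\rho$, approximates the derivatives
by central differences, and samples a small, arbitrarily chosen set of states.
The names `gamma_bounds` and `perfect_gas` are hypothetical.

```python
import numpy as np

def gamma_bounds(pressure, rho, eps, rel_step=1e-6):
    """Return (Gamma, gamma) of condition (13.7) at the state (rho, eps).

    The partial derivatives of the pressure law are approximated by
    central differences, so no analytical EOS derivative is needed.
    """
    d_rho, d_eps = rel_step * rho, rel_step * eps
    p_rho = (pressure(rho + d_rho, eps) - pressure(rho - d_rho, eps)) / (2.0 * d_rho)
    p_eps = (pressure(rho, eps + d_eps) - pressure(rho, eps - d_eps)) / (2.0 * d_eps)
    big_gamma = 1.0 + p_eps / rho
    small_gamma = rho * p_rho / pressure(rho, eps) + p_eps / rho
    return big_gamma, small_gamma

def perfect_gas(rho, eps, gamma=1.4):
    """Polytropic pressure law, used here only as a check case."""
    return (gamma - 1.0) * rho * eps

# Sample a (hypothetical) range of states and take the largest value found.
samples = [gamma_bounds(perfect_gas, rho, eps)
           for rho in np.linspace(0.1, 2.0, 5)
           for eps in np.linspace(1.0, 10.0, 5)]
gamma1_lower_bound = max(max(pair) for pair in samples)
```

For the perfect gas both quantities reduce to the constant $\gamma$, so the
estimate is a sanity check rather than a proof of the supremum; for a real EOS
the sampled range would have to cover the states reached in the computation.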


The procedure to solve the Euler system (13.1) within the framework of
the energy relaxation theory is the following. Given the numerical equilibrium
solution at the time level $t^n$

$$
\rho(x, t^n),\ u(x, t^n),\ \varepsilon(x, t^n), \tag{13.8}
$$

this approximation is advanced to the next time level
$t^{n+1} = t^n + \Delta t$ in two steps.


-  First step: relaxation. The two internal energies $\varepsilon_1(x, t^n)$
   and $\varepsilon_2(x, t^n)$ are obtained from (13.5) and the consistency
   condition (13.6):

$$
\begin{aligned}
\varepsilon_1(x, t^n) &= \frac{p(\rho(x, t^n), \varepsilon(x, t^n))}{(\gamma_1 - 1)\,\rho(x, t^n)}, \\
\varepsilon_2(x, t^n) &= \varepsilon(x, t^n) - \varepsilon_1(x, t^n).
\end{aligned}
\tag{13.9}
$$

   Notice that this step involves just one call to the pressure law per grid
   point and does not involve any derivatives of the pressure law or any
   iterations.




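-  Second step: evolution in time. For $t^n < t < t^{n+1}$, we solve the Cauchy
   problem for the relaxation system (13.4) with zero right-hand side:

$$
\begin{aligned}
&\partial_t \rho^\lambda + \operatorname{div}(\rho^\lambda u^\lambda) = 0, \qquad t > 0,\ x \in \mathbb{R}^d, \\
&\partial_t \rho^\lambda u^\lambda + \operatorname{div}(\rho^\lambda u^\lambda \otimes u^\lambda + p_1^\lambda) = 0, \\
&\partial_t E_1^\lambda + \operatorname{div}((E_1^\lambda + p_1^\lambda) u^\lambda) = 0, \\
&\partial_t \rho^\lambda \varepsilon_2^\lambda + \operatorname{div}(\rho^\lambda u^\lambda \varepsilon_2^\lambda) = 0,
\end{aligned}
\tag{13.10}
$$

   and the initial data

$$
\rho(x, t^n),\ u(x, t^n),\ \varepsilon_1(x, t^n),\ \varepsilon_2(x, t^n), \tag{13.11}
$$

   and we obtain at time $t^{n+1-}$

$$
\rho(x, t^{n+1-}),\ u(x, t^{n+1-}),\ \varepsilon_1(x, t^{n+1-}),\ \varepsilon_2(x, t^{n+1-}). \tag{13.12}
$$

   Finally, we compute the equilibrium solution at time $t^{n+1}$ by

$$
\begin{aligned}
\rho(x, t^{n+1}) &= \rho(x, t^{n+1-}), \\
u(x, t^{n+1}) &= u(x, t^{n+1-}), \\
\varepsilon(x, t^{n+1}) &= \varepsilon_1(x, t^{n+1-}) + \varepsilon_2(x, t^{n+1-}).
\end{aligned}
\tag{13.13}
$$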
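
The two bookkeeping operations of the procedure above, the energy split (13.9)
before the evolution step and the recombination (13.13) after it, can be
sketched as follows. This is only an illustration under stated assumptions: the
function names `relax`, `recombine` and `perfect_gas` are hypothetical, the
pressure law is passed in as a plain callable, and the evolution step itself is
not shown.

```python
import numpy as np

def relax(rho, eps, pressure, gamma1):
    """First step (13.9): split eps into eps1 + eps2 with one EOS call."""
    p = pressure(rho, eps)                 # the single call to the pressure law
    eps1 = p / ((gamma1 - 1.0) * rho)      # energy carried by the gamma1-law
    eps2 = eps - eps1                      # remainder, simply convected
    return eps1, eps2

def recombine(eps1, eps2):
    """Equilibrium internal energy (13.13) after the evolution step."""
    return eps1 + eps2

def perfect_gas(rho, eps, gamma=1.4):
    """Example pressure law; any callable p(rho, eps) could be used instead."""
    return (gamma - 1.0) * rho * eps

rho = np.array([1.0, 0.125])
eps = np.array([2.5, 2.0])
eps1, eps2 = relax(rho, eps, perfect_gas, gamma1=3.0)
```

By construction the relaxed pressure $(\gamma_1 - 1)\rho\varepsilon_1$ equals the
original pressure at the start of each evolution step, which is the content of
the consistency condition (13.6).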


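Remark 13.1. The first step is clearly a relaxation phase, as it is equivalent
to solving the following ODE problem for $t > t^n$

$$
\begin{aligned}
&\partial_t \rho^\lambda = 0, \\
&\partial_t \rho^\lambda u^\lambda = 0, \\
&\partial_t E_1^\lambda = \lambda \rho^\lambda \left(\varepsilon_2^\lambda - \phi(\rho^\lambda, \varepsilon_1^\lambda)\right), \\
&\partial_t \rho^\lambda \varepsilon_2^\lambda = -\lambda \rho^\lambda \left(\varepsilon_2^\lambda - \phi(\rho^\lambda, \varepsilon_1^\lambda)\right),
\end{aligned}
\tag{13.14}
$$

with initial data at time level $t^n$

$$
\rho(x, t^{n-}),\ u(x, t^{n-}),\ \varepsilon_1(x, t^{n-}),\ \varepsilon_2(x, t^{n-}), \tag{13.15}
$$

and letting $\lambda \to +\infty$. □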


We now describe the numerical method we use for the step of evolution in time.
Although our numerical results concern both one and two dimensional problems,
for simplicity of presentation we restrict our description to one space
dimension. As we are using the finite difference version of WENO schemes in
[55], extensions to two and more spatial dimensions are simply done dimension
by dimension. Essentially, the two dimensional code is the one dimensional code
with an outside “do loop”.

We have to solve, for $t^n < t < t^{n+1}$, the following system of four equations

$$
\partial_t U + \partial_x F(U) = 0, \quad \text{with initial conditions given by (13.11)}, \tag{13.16}
$$

where

$$
\begin{aligned}
U &= (\rho,\ \rho u,\ E_1,\ \rho\varepsilon_2)^T, \\
F(U) &= (\rho u,\ \rho u^2 + p_1,\ (E_1 + p_1)u,\ \rho u \varepsilon_2)^T.
\end{aligned}
\tag{13.17}
$$
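
A minimal sketch of the flux function for the four-equation system, assuming
the $\gamma_1$-law $p_1 = (\gamma_1 - 1)\rho\varepsilon_1$ with
$\rho\varepsilon_1 = E_1 - \tfrac{1}{2}\rho u^2$. The function name
`relaxed_flux` and the array layout are hypothetical choices for illustration.

```python
import numpy as np

def relaxed_flux(U, gamma1):
    """Flux F(U) of (13.17) for U = (rho, rho*u, E1, rho*eps2)."""
    rho, mom, E1, rho_eps2 = U
    u = mom / rho
    # gamma1-law pressure: p1 = (gamma1 - 1) * rho * eps1,
    # where rho * eps1 = E1 - rho * u^2 / 2.
    p1 = (gamma1 - 1.0) * (E1 - 0.5 * rho * u**2)
    return np.array([mom, mom * u + p1, (E1 + p1) * u, rho_eps2 * u])

F = relaxed_flux(np.array([1.0, 2.0, 4.0, 0.5]), gamma1=3.0)
```

Note that the first three components do not involve $\varepsilon_2$ at all; the
fourth is a pure advection flux for $\rho\varepsilon_2$.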

In order to solve the ordinary differential equation

$$
\frac{dU}{dt} = L(U), \tag{13.18}
$$

where $L(U)$ is a discretization of the spatial operator, we use a third-order
TVD Runge-Kutta scheme (9.10), [89].
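
For orientation, the widely used third-order TVD (strong stability preserving)
Runge-Kutta scheme of Shu and Osher can be sketched as below. It is assumed
here that this is the scheme referred to as (9.10); the spatial operator `L` is
any callable, and the scalar decay problem is only a convergence check, not a
WENO discretization.

```python
import numpy as np

def tvd_rk3_step(u, dt, L):
    """One step of the third-order TVD Runge-Kutta scheme for du/dt = L(u)."""
    u1 = u + dt * L(u)
    u2 = 0.75 * u + 0.25 * (u1 + dt * L(u1))
    return u / 3.0 + 2.0 / 3.0 * (u2 + dt * L(u2))

def integrate(u0, t_end, n_steps, L):
    """Advance u0 to t_end with n_steps equal time steps."""
    u = np.asarray(u0, dtype=float)
    dt = t_end / n_steps
    for _ in range(n_steps):
        u = tvd_rk3_step(u, dt, L)
    return u

# Convergence check on du/dt = -u, u(0) = 1, exact solution exp(-t).
decay = lambda u: -u
err_coarse = abs(integrate([1.0], 1.0, 20, decay)[0] - np.exp(-1.0))
err_fine = abs(integrate([1.0], 1.0, 40, decay)[0] - np.exp(-1.0))
observed_order = np.log2(err_coarse / err_fine)
```

Each stage is a convex combination of forward Euler steps, which is what makes
the scheme TVD whenever the forward Euler step is.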

Remark 13.2. We have two possibilities for the placement of the relaxation
step: at each Runge-Kutta inner stage or at each time step. With Example 13.3
below we show that the two approaches give nearly identical accuracy. The
second approach is of course less costly. We thus perform all our calculations
using the second approach. □


We now discretize the space into uniform intervals of size $\Delta x$ and
denote $x_j = j\Delta x$. Various quantities at $x_j$ will be identified by the
subscript $j$.

We use the WENO procedure described in Sect. 4.1 to obtain the spatial
operator $L_j(U)$ which approximates $-\partial_x F(U)$ at $x_j$. We have tested
several possibilities for the definition of $L(U)$ based on WENO schemes. The
first one is to use a WENO Lax-Friedrichs scheme with a full characteristic
decomposition. For this purpose we need to compute a Roe matrix for the system
(13.16) and its eigenvalues and eigenvectors. The details of this derivation
can be found in [74].

The other possibility is to compute the first three components of the
numerical flux, $F^1_{j+1/2}$, $F^2_{j+1/2}$, $F^3_{j+1/2}$, by using a WENO
Lax-Friedrichs scheme with a decomposition on the Euler system characteristics,
and to obtain the last numerical flux $F^4_{j+1/2}$ with a scalar WENO
Lax-Friedrichs scheme. This is possible because the first three equations of
system (13.16) are independent of the last one.

Remark 13.3. We have also tried to compute the last numerical flux by using
a first order scheme specially designed to preserve the maximum principle for
$\varepsilon_2$ [63]. But with this approach, we lose the accuracy of the
high-order WENO scheme for the other variables as well. □

Remark 13.4. In order to make comparisons in the numerical results we have
also implemented a WENO Lax-Friedrichs scheme with a full characteristic
decomposition for a two molecular vibrating gas (see the next subsection for a
description of the related EOS). For this purpose we need a definition of the
corresponding Roe average matrix, see [74]. For the numerical comparisons for
the other real gases we use a component-wise WENO Lax-Friedrichs scheme,
which requires only the computation of the sound velocity

$$
c = \sqrt{p_{,\rho} + \frac{p\,p_{,\varepsilon}}{\rho^2}}. \tag{13.19}
$$

□
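
A short sketch of the sound speed evaluation, assuming the standard expression
$c^2 = p_{,\rho} + p\,p_{,\varepsilon}/\rho^2$ for an EOS of the form
$p(\rho, \varepsilon)$. The function name `sound_speed` is hypothetical; the
check uses a perfect gas, for which $c^2 = \gamma p/\rho$.

```python
import math

def sound_speed(p, p_rho, p_eps, rho):
    """Sound speed from the pressure and its two partial derivatives."""
    return math.sqrt(p_rho + p * p_eps / rho**2)

# Perfect gas check: p = (gamma - 1) * rho * eps.
gamma, rho, eps = 1.4, 1.0, 2.5
p = (gamma - 1.0) * rho * eps
c = sound_speed(p, (gamma - 1.0) * eps, (gamma - 1.0) * rho, rho)
```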
13.2  Numerical Results

We present here several equations of state which we will use in the
computations. The second one is taken from the paper of In [51], while the
third one comes from Glaister [33].

•  Polytropic ideal gas. The equation of state for a polytropic ideal gas (also
called a perfect gas) is the following

$$
p(\rho, \varepsilon) = (\gamma - 1)\rho\varepsilon. \tag{13.20}
$$

Then we have

$$
p_{,\rho} = (\gamma - 1)\varepsilon, \qquad p_{,\varepsilon} = (\gamma - 1)\rho. \tag{13.21}
$$

Air under normal conditions ($p$ and $T$ moderate enough) can be considered
as a perfect gas with $\gamma = 7/5 = 1.4$ (approximately a mixture of two
diatomic molecular species: 20% of $O_2$, 80% of $N_2$).


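•  Two molecular vibrating gas. When the temperature increases, the vibrational
motion of oxygen and nitrogen molecules in air becomes important, and the
specific heats vary with temperature. One must then consider the following
thermally perfect, calorically imperfect model for a two molecular vibrating gas

$$
p(\rho, \varepsilon) = r\rho\,T(\varepsilon), \tag{13.22}
$$

where the temperature $T$ is given by the implicit expression

$$
\varepsilon = c_v^{tr}\,T + \frac{\alpha\,\theta_{vib}}{\exp(\theta_{vib}/T) - 1}, \tag{13.23}
$$

with $r = 287.086\ \mathrm{J \cdot kg^{-1} \cdot K^{-1}}$,
$c_v^{tr} = r/(\gamma^{tr} - 1)$, $\gamma^{tr} = 1.4$,
$\theta_{vib} = 10^3\ \mathrm{K}$, $\alpha = r$. Then we have

$$
p_{,\rho} = r\,T(\varepsilon), \qquad
p_{,\varepsilon} = \frac{r\rho}{d\varepsilon/dT}, \tag{13.24}
$$

where $d\varepsilon/dT$ is evaluated from (13.23) at $T = T(\varepsilon)$.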
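
Because the temperature is only defined implicitly, each pressure evaluation
for this model requires a scalar root solve. The sketch below uses Newton's
method and assumes the reading
$\varepsilon = c_v^{tr} T + \alpha\theta_{vib}/(\exp(\theta_{vib}/T) - 1)$ with
$\theta_{vib} = 10^3$ K; the function names are hypothetical, and no safeguard
is included for very low temperatures, where the exponential would overflow.

```python
import math

R_GAS = 287.086                      # r, in J kg^-1 K^-1
GAMMA_TR = 1.4
CV_TR = R_GAS / (GAMMA_TR - 1.0)     # translational-rotational specific heat
THETA_VIB = 1.0e3                    # vibrational temperature in K (assumed)
ALPHA = R_GAS

def energy_of_T(T):
    """Specific internal energy as a function of temperature."""
    return CV_TR * T + ALPHA * THETA_VIB / math.expm1(THETA_VIB / T)

def denergy_dT(T):
    """Derivative d(eps)/dT of the expression above."""
    x = THETA_VIB / T
    ex = math.exp(x)
    return CV_TR + ALPHA * x * x * ex / (ex - 1.0) ** 2

def temperature(eps, tol=1e-12, max_iter=50):
    """Solve energy_of_T(T) = eps for T by Newton's method."""
    T = eps / CV_TR                  # upper bound: vibrational energy >= 0
    for _ in range(max_iter):
        dT = (energy_of_T(T) - eps) / denergy_dT(T)
        T -= dT
        if abs(dT) <= tol * T:
            break
    return T

def pressure(rho, eps):
    """Pressure law p = r * rho * T(eps)."""
    return R_GAS * rho * temperature(eps)

def pressure_derivatives(rho, eps):
    """Partial derivatives (p_rho, p_eps) of the pressure law."""
    T = temperature(eps)
    return R_GAS * T, R_GAS * rho / denergy_dT(T)
```

Since the energy is an increasing, convex function of $T$, starting Newton's
method from the upper bound $\varepsilon / c_v^{tr}$ gives monotone convergence
in exact arithmetic. This iterative solve is part of the cost that the
relaxation approach limits to one pressure call per grid point and time step.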


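•  Osborne model. R. K. Osborne from the Los Alamos Scientific Laboratory
has developed a quite general equation of state of the following form [81]

$$
p(\rho, \varepsilon) = \frac{1}{E + \phi_0}\Bigl(\zeta(a_1 + a_2\zeta) + E\bigl(b_0 + \zeta(b_1 + b_2\zeta) + E(c_0 + c_1\zeta)\bigr)\Bigr), \tag{13.25}
$$

where $E = \rho_0\varepsilon$ and $\zeta = \rho/\rho_0 - 1$, and the constants
$\rho_0$, $a_1$, $a_2$, $b_0$, $b_1$, $b_2$, $c_0$, $c_1$, $\phi_0$ depend on
the material in question. The typical values for water are
$\rho_0 = 10^{-2}$, $a_1 = 3.84 \times 10^{-4}$, $a_2 = 1.756 \times 10^{-3}$,
$b_0 = 1.312 \times 10^{-2}$, $b_1 = 6.265 \times 10^{-2}$, $b_2 = 0.2133$,
$c_0 = 0.5132$, $c_1 = 0.6761$ and $\phi_0 = 2 \times 10^{-2}$.
Then we have

$$
\begin{aligned}
p_{,\rho} &= \frac{1}{\rho_0(E + \phi_0)}\bigl(a_1 + 2a_2\zeta + E(b_1 + 2b_2\zeta + E c_1)\bigr), \\
p_{,\varepsilon} &= -\frac{\rho_0}{E + \phi_0}\,p + \frac{\rho_0}{E + \phi_0}\bigl(b_0 + \zeta(b_1 + b_2\zeta) + 2E(c_0 + c_1\zeta)\bigr).
\end{aligned}
$$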
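
The Osborne pressure law and its derivatives can be coded directly from the
expressions above. The sketch below assumes the reading
$p = N/(E + \phi_0)$ with $E = \rho_0\varepsilon$ and $\zeta = \rho/\rho_0 - 1$,
uses the water constants quoted in the text, and checks the analytical
derivatives against central differences at an arbitrary state. The function
names are hypothetical.

```python
RHO0 = 1.0e-2
A1, A2 = 3.84e-4, 1.756e-3
B0, B1, B2 = 1.312e-2, 6.265e-2, 0.2133
C0, C1 = 0.5132, 0.6761
PHI0 = 2.0e-2

def osborne_pressure(rho, eps):
    """Osborne pressure law p(rho, eps) with the constants for water."""
    E = RHO0 * eps
    z = rho / RHO0 - 1.0
    num = z * (A1 + A2 * z) + E * (B0 + z * (B1 + B2 * z) + E * (C0 + C1 * z))
    return num / (E + PHI0)

def osborne_derivatives(rho, eps):
    """Analytical partial derivatives (p_rho, p_eps)."""
    E = RHO0 * eps
    z = rho / RHO0 - 1.0
    p = osborne_pressure(rho, eps)
    p_rho = (A1 + 2.0 * A2 * z + E * (B1 + 2.0 * B2 * z + E * C1)) / (RHO0 * (E + PHI0))
    p_eps = RHO0 / (E + PHI0) * (B0 + z * (B1 + B2 * z) + 2.0 * E * (C0 + C1 * z) - p)
    return p_rho, p_eps

# Finite-difference check at an arbitrary (hypothetical) state.
rho, eps, h = 0.02, 10.0, 1e-6
p_rho, p_eps = osborne_derivatives(rho, eps)
fd_rho = (osborne_pressure(rho * (1 + h), eps) - osborne_pressure(rho * (1 - h), eps)) / (2 * h * rho)
fd_eps = (osborne_pressure(rho, eps * (1 + h)) - osborne_pressure(rho, eps * (1 - h))) / (2 * h * eps)
```

A check of this kind is a cheap way to catch transcription errors in the
derivative formulas, which are needed by schemes based on the sound velocity or
on a characteristic decomposition but not by the relaxation step itself.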
Example 13.1. Shock tube problem. This is the one dimensional Riemann
problem test case with a perfect gas, already used in Example 11.1. Of
course, for this perfect gas situation there is no need to use the relaxation
model in practice. The purpose of this test problem is to test the behavior of
different relaxation models (different values of $\gamma_1$) and different ways
of treating the relaxed system (fully characteristic, or partially
characteristic for the first three equations only).

For this example, a uniform grid of 100 points is used and every second point
is drawn in the figures.

We first give, in Table 13.1, a CPU time comparison among the traditional
WENO characteristic scheme for the perfect gas and the WENO scheme applied to
the relaxation system, both with a fully characteristic decomposition and with
a partially characteristic decomposition for the first three equations only.
The calculation is done on a SUN Ultra1 workstation. We can see that while a
fully characteristic decomposition is significantly more costly, the partially
characteristic decomposition is only moderately more costly than the WENO
scheme applied to the original perfect gas Euler equations.


Table 13.1. CPU time (in seconds) of different schemes for the Sod and Lax shock
tube problems for a perfect gas.

Case        WENO with        Relaxed WENO with     Relaxed WENO with
            characteristic   full characteristic   partial characteristic
Sod shock   2.28             3.49                  2.91
Lax shock   3.32             4.93                  4.08

In Figures 13.1 and 13.3 we compare, for Sod's and Lax's shock tube problems,
the fifth order WENO scheme applied directly to the perfect gas Euler equations
using a characteristic decomposition with the same scheme applied to the
relaxation model with $\gamma_1 = 3$ using only a partial characteristic
decomposition of the first three equations. We can see that the results are
very close, except for slight over- and under-shoots in entropy for the
relaxation model calculation. This indicates the feasibility of using the
relaxation model.

In Figures 13.2 and 13.4 we present further comparisons for Sod's and Lax's
shock tube problems with the fifth order WENO schemes. The top left figure
compares the full characteristic decomposition for the relaxation model with a
partial characteristic decomposition for the first three equations only, for
$\gamma_1 = 3$. We can see that the results are quite close, again indicating
the feasibility of using the less costly partial characteristic decomposition
for the relaxation model. The top right figure compares the effect of different
values of $\gamma_1$ in the relaxation model. Apparently a bigger $\gamma_1$
corresponds to larger numerical dissipation. This indicates that one should
always choose the smallest possible $\gamma_1$ subject to stability
considerations. The bottom figure compares the relaxation WENO results for
$\gamma_1 = 3$ and a partial characteristic decomposition with a component-wise
WENO scheme applied directly to the original perfect gas Euler equations.
Although neither uses the correct characteristic information, the relaxation
model results are apparently better than the component-wise results, especially
for Lax's problem in Figure 13.4.

Fig. 13.1. Sod's shock tube problem with WENO-LF-5 characteristic and relaxed
WENO-LF-5 partial characteristic with $\gamma_1 = 3.0$. Top left: density; Top
right: velocity; Bottom left: pressure; Bottom right: entropy.

Fig. 13.2. Sod's shock tube problem with WENO-LF-5. Comparisons of partial and
full characteristic decompositions for the relaxation model with $\gamma_1 = 3$
(top, left); $\gamma_1 = 3$ and $\gamma_1 = 30$ for the relaxation model with
partial characteristic decomposition (top, right); and the relaxation model with
partial characteristic decomposition with $\gamma_1 = 3$ versus the
component-wise WENO applied to the original perfect gas Euler equations
(bottom).


Fig. 13.3. Lax's shock tube problem with WENO-LF-5 characteristic and relaxed
WENO-LF-5 partial characteristic with $\gamma_1 = 3.0$. Top left: density; Top
right: velocity; Bottom left: pressure; Bottom right: entropy.

Fig. 13.4. Lax's shock tube problem with WENO-LF-5. Comparisons of partial and
full characteristic decompositions for the relaxation model with $\gamma_1 = 3$
(top, left); $\gamma_1 = 3$ and $\gamma_1 = 30$ for the relaxation model with
partial characteristic decomposition (top, right); and the relaxation model with
partial characteristic decomposition with $\gamma_1 = 3$ versus the
component-wise WENO applied to the original perfect gas Euler equations
(bottom).

Example 13.2. Shock tube problem for real gas. In this example we compute the
solutions to the Riemann shock tube problem for the two molecular vibrating gas
(13.22)-(13.24) and the Osborne model (13.25), with the initial conditions
given in Table 13.2.


Table 13.2. Initial conditions for the test cases for real gases.

Case   State   ρ        u        ε
A      Left    0.066    0.0      7.22e6
       Right   0.030    0.0      1.44e6
B      Left    1.40     0.0      2.22e6
       Right   0.14     0.0      2.24e6
C      Left    1.2900   0.0      1.95e6
       Right   0.0129   0.0      2.75e6
D      Left    1.00     0.0      2.00e6
       Right   0.01     0.0      2.50e5
E      Left    0.01     2200.0   1.44e5
       Right   0.14     0.0      4.00e5

For this example, a uniform grid of 200 points is used and every fourth point
is drawn in the figures. The “exact solution” in the figures is obtained
with the best scheme using 2000 points.

We first give, in Table 13.3, a CPU time comparison between the full
characteristic decomposition for the original model and the partial
characteristic decomposition using only the first three equations of the
relaxation model, for the two molecular vibrating gas model. We can see that
the partial characteristic decomposition for the relaxed model usually costs
less than half as much as the full characteristic version for the original
system. Although the relaxed model has one more equation, it does not require
the computation of the complicated derivatives of the EOS.

In Figure 13.5 we compare the full characteristic decomposition for the
original model with the partial characteristic decomposition using only the
first three equations of the relaxation model, for the two molecular vibrating
gas model with the case A initial condition. The results are almost identical,
indicating that the relaxation model with a partial characteristic
decomposition works well at a much reduced cost.

In Figure 13.6 we compare the component-wise WENO scheme on the original
system with the partially characteristic WENO scheme on the relaxed system with
$\gamma_1 = 2.0$, for the Osborne gas model with the case A initial condition.
We can see that the result of the relaxed model is much better, especially for
the density. This indicates that the relaxation model is a good one for the
computation of real gases.
Table 13.3. CPU time (in seconds) depending on full or partial characteristic
decomposition with a two vibrating molecular gas.

Case   WENO with        Relaxed WENO with
       characteristic   partial characteristic
A      12.68            5.21
B      4.8              2.63
C      12.53            4.87
D      15.0             5.35
E      15.0             7.84
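
The cost ratios implied by Table 13.3 can be tabulated with a few lines. This
is only arithmetic on the published CPU times; the variable names are
hypothetical.

```python
# CPU times from Table 13.3: (full characteristic, relaxed partial characteristic)
cpu_seconds = {
    "A": (12.68, 5.21),
    "B": (4.8, 2.63),
    "C": (12.53, 4.87),
    "D": (15.0, 5.35),
    "E": (15.0, 7.84),
}

# Ratio > 1 means the relaxed partial-characteristic scheme is cheaper.
speedup = {case: full / partial for case, (full, partial) in cpu_seconds.items()}
cases_above_two = sorted(case for case, s in speedup.items() if s > 2.0)
```

With these numbers the ratio exceeds two for cases A, C and D and is somewhat
below two for cases B and E.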

Fig. 13.5. Case A + two vibrating molecular gas model with WENO-LF-5
characteristic and relaxed WENO-LF-5 partial characteristic with
$\gamma_1 = 1.5$. Top left: density; Top right: velocity; Bottom left: pressure;
Bottom right: $\gamma$ and $T$.

Fig. 13.6. Case A + Osborne gas model with component-wise WENO-LF-5 for the
original system and relaxed WENO-LF-5 partial characteristic with
$\gamma_1 = 2.0$. Top left: density; Top right: velocity; Bottom: pressure.

In Figure 13.7 we compare the choice $\gamma_1 = 10$, which satisfies the
stability condition (13.7), with $\gamma_1 = 2$, which satisfies only the second
inequality in the stability condition (13.7), for the partial characteristic
decomposition using only the first three equations of the relaxation model and
the Osborne gas model with the case A initial condition. We can see that the
$\gamma_1 = 2$ results are stable and less dissipative, indicating that in
practice one does not always have to choose $\gamma_1$ satisfying both
inequalities in condition (13.7).

Fig. 13.7. Case A + Osborne gas model with the relaxed WENO-LF-5 partial
characteristic with $\gamma_1 = 10.0$ and $\gamma_1 = 2.0$.


We have also tested the same problems for the other initial condition cases
B, C, D and E. The results are qualitatively similar to those of case A. To
save space we do not present them here.

Example 13.3. Vortex evolution. This is the same case as in Example 11.8, the
purpose here being to verify the accuracy of the relaxation approach,
especially with respect to the placement of the relaxation steps during time
stepping. The gas is ideal but we still use the relaxation model.

In Table 13.4 we show the accuracy results at t = 10 (one time period).
We can see that WENO for the relaxed model with $\gamma_1 = 3$ gives a somewhat
larger error than WENO applied directly to the original system, but the order
of accuracy is correct. Moreover, placing the relaxation step at each
Runge-Kutta inner stage or just at each time step gives almost identical
results. We have thus used the less costly version, with the relaxation step at
every time step, in all the numerical examples in this section.


Table 13.4. L1 error and order of accuracy at t = 10 (1 period).

Nb. points   WENO
             error     order
20 x 20      1.07e-2   -
40 x 40      1.06e-3   3.3
80 x 80      6.50e-5   4.0
160 x 160    2.09e-6   4.9

Nb. points   Relaxed WENO,        Relaxed WENO,
             each time step       each R-K step
             error     order      error     order
20 x 20      1.22e-2   -          1.22e-2   -
40 x 40      2.16e-3   2.5        2.17e-3   2.5
80 x 80      1.77e-4   3.6        1.78e-4   3.6
160 x 160    7.57e-6   4.6        7.60e-6   4.6
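
The order column of Table 13.4 follows from the errors on successive grids:
when the mesh size is halved, the observed order is $\log_2$ of the error
ratio. The sketch below recomputes it from the tabulated errors; the names are
hypothetical, and small differences from the printed orders are to be expected
because the tabulated errors are rounded to three digits.

```python
import math

# L1 errors from Table 13.4 on the 20^2, 40^2, 80^2 and 160^2 grids.
errors = {
    "WENO": [1.07e-2, 1.06e-3, 6.50e-5, 2.09e-6],
    "Relaxed WENO, each time step": [1.22e-2, 2.16e-3, 1.77e-4, 7.57e-6],
    "Relaxed WENO, each R-K step": [1.22e-2, 2.17e-3, 1.78e-4, 7.60e-6],
}

# Orders as printed in the table, for comparison.
tabulated = {
    "WENO": [3.3, 4.0, 4.9],
    "Relaxed WENO, each time step": [2.5, 3.6, 4.6],
    "Relaxed WENO, each R-K step": [2.5, 3.6, 4.6],
}

def observed_orders(errs):
    """Observed order between consecutive grids with mesh size halved."""
    return [math.log2(coarse / fine) for coarse, fine in zip(errs, errs[1:])]

orders = {name: observed_orders(errs) for name, errs in errors.items()}
```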

Example 13.4. Double Mach reflection. First we present the results for a
perfect gas, which is the same case as in Example 11.5. We compare the results
using WENO directly on the original system [55] with those using it on the
relaxed model with $\gamma_1 = 1.5$ and $\gamma_1 = 3.0$, in Fig. 13.8 for a
mesh of 480 x 120 points and in Fig. 13.9 for a mesh of 960 x 240 points. We can
see that the relaxed model results are quite satisfactory, although a bigger
$\gamma_1$ results in some small oscillations.

Next, we show the results of the same problem with the two vibrating molecular
gas. The purpose here is to show that the relaxation model based algorithm does
work, rather than to study the details of the flow with more physical models.
The results with both a 480 x 120 grid and a 960 x 240 grid are shown in
Fig. 13.10. Comparing with the results in [26], we can see that the main
features, such as the main shock being closer to the bottom boundary and the
shock below the triple point being bent, are also observed here.

[Figure panels: density, WENO-LF-5 characteristic; density, relaxed WENO-LF-5
partial characteristic, $\gamma_1 = 1.5$.]

Fig. 13.8. Double-Mach reflection, perfect gas, 480 x 120 grid points.

[Figure panels: density, WENO-LF-5 characteristic; density, relaxed WENO-LF-5
partial characteristic, $\gamma_1 = 1.5$; density, relaxed WENO-LF-5 partial
characteristic, $\gamma_1 = 3.0$. Each panel: 30 contours from 1.4 to 20.0,
grid 960 x 240, t = 0.2.]

Fig. 13.9. Double-Mach reflection, perfect gas, 960 x 240 grid points.

[Figure panels: density, real gas, relaxed WENO-LF-5 partial characteristic,
$\gamma_1 = 1.5$ (two panels).]

Fig. 13.10. Double-Mach reflection, two vibrating molecular gas.

14  Applications  to  Incompressible  Flows 

In this section we consider numerically solving the incompressible
Navier-Stokes or Euler equations

$$
\begin{aligned}
u_t + u u_x + v u_y &= \mu(u_{xx} + u_{yy}) - p_x, \\
v_t + u v_x + v v_y &= \mu(v_{xx} + v_{yy}) - p_y, \\
u_x + v_y &= 0,
\end{aligned}
\tag{14.1}
$$

or their equivalent conservative form

$$
\begin{aligned}
u_t + (u^2)_x + (uv)_y &= \mu(u_{xx} + u_{yy}) - p_x, \\
v_t + (uv)_x + (v^2)_y &= \mu(v_{xx} + v_{yy}) - p_y, \\
u_x + v_y &= 0,
\end{aligned}
\tag{14.2}
$$

where $(u, v)$ is the velocity vector, $p$ is the pressure, $\mu > 0$ for the
Navier-Stokes equations and $\mu = 0$ for the Euler equations, using ENO and
WENO schemes. We do not discuss the issue of boundary conditions here, thus the
equations are defined on the box $[0, 2\pi] \times [0, 2\pi]$ with periodic
boundary conditions in both directions. We choose two space dimensions for ease
of presentation, although our method is also applicable in three space
dimensions.

In some sense equations (14.1) are easier to solve numerically than their
compressible counterparts in the previous three sections, because the latter
have solutions containing possible discontinuities (for example shocks and
contact discontinuities). However, the solution to (14.1), even if
mathematically smooth in most cases, may evolve rather rapidly with time t and
may easily become too complicated to be fully resolved on a feasible grid.
Traditional  linearly  stable  schemes,  such  as  spectral  methods  and  high-order 
central  difference  methods,  are  suitable  for  the  cases  where  the  solution  can 
be  fully  resolved,  but  typically  produce  signs  of  instability  such  as  oscillations 
when  small  scale  features  of  the  flow,  such  as  shears  and  roll-ups,  cannot  be 
adequately  resolved  on  the  computational  grid.  Although  in  principle  one 
can  always  overcome  this  difficulty  by  refining  the  grid,  today’s  computer 
capacity  seriously  restricts  the  largest  possible  grid  size. 

As  we  know,  the  high  resolution  “shock  capturing”  schemes  such  as  ENO 
and  WENO  are  based  on  the  philosophy  of  giving  up  fully  resolving  rapid 
transition  regions  or  shocks,  just  to  “capture”  them  in  a  stable  and  somehow 
globally  correct  fashion  (e.g.,  with  correct  shock  speed),  but  at  the  same  time 
to  require  a  high  resolution  for  the  smooth  part  of  the  solution.  The  success  of 
such  an  approach  for  the  conservation  laws  is  documented  by  many  examples 
in  these  lecture  notes  and  the  references.  One  example  is  the  one  and  two 
dimensional  shock  interaction  with  vorticity  or  entropy  waves  [90],  [91].  The 
shock  is  captured  sharply  and  certain  key  quantities  related  to  the  interaction 
between  the  shock  and  the  smooth  part  of  the  flow,  such  as  the  amplification 
and  generation  factors  when  a  wave  passes  through  a  shock,  are  well  resolved. 
Another example is the homogeneous turbulence for compressible Navier-Stokes
equations studied in [91]. In one of the test cases, the spectral method
can resolve all the scales using a $256^2$ grid, while third order ENO with just
$64^2$ points can adequately resolve certain interesting quantities although it
cannot  resolve  local  quantities  achieved  inside  the  rapid  transition  region  such 
as  the  minimum  divergence.  The  conclusion  seems  to  be  that,  when  fully 
resolving  the  flow  is  either  impossible  or  too  costly,  a  “capturing”  scheme 
such  as  ENO  can  be  used  on  a  coarse  grid  to  obtain  at  least  some  partial 
information  about  the  flow. 

We  thus  expect  that,  also  for  the  incompressible  flow,  we  can  use  high- 
order  ENO  or  WENO  schemes  on  a  coarse  grid,  without  fully  resolving  the 
flow,  but  still  get  back  some  useful  information. 

A pioneering work in applying shock capturing compressible flow techniques
to incompressible flow is by Bell, Colella and Glaz [8], in which they considered
a second order Godunov type discretization, investigated the projection
into divergence-free velocity fields for general boundary conditions, and
discussed  accuracy  of  time  discretizations.  Higher  order  ENO  and  WENO 
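schemes for incompressible flows are extensions of such methods.

We solve (14.2) in its equivalent projection form

$$\begin{pmatrix} u \\ v \end{pmatrix}_t = P \begin{pmatrix} -(u^2)_x - (uv)_y + \mu (u_{xx} + u_{yy}) \\ -(uv)_x - (v^2)_y + \mu (v_{xx} + v_{yy}) \end{pmatrix} \qquad (14.3)$$

where $P$ is the Hodge projection into divergence-free fields, i.e., if
$(\tilde u, \tilde v)^T = P (u, v)^T$, then $\tilde u_x + \tilde v_y = 0$ and
$\tilde v_x - \tilde u_y = v_x - u_y$. See, e.g., [8]. For the current periodic
case the additional condition to obtain a unique projection $P$ is that the
mean values of $u$ and $v$ are preserved, i.e.,
$\int_0^{2\pi}\int_0^{2\pi} \tilde u(x,y)\,dx\,dy = \int_0^{2\pi}\int_0^{2\pi} u(x,y)\,dx\,dy$
and
$\int_0^{2\pi}\int_0^{2\pi} \tilde v(x,y)\,dx\,dy = \int_0^{2\pi}\int_0^{2\pi} v(x,y)\,dx\,dy$.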

We use $N_x$ and $N_y$ (even numbers) equally spaced grid points in $x$ and
$y$, respectively. The grid sizes are denoted by $\Delta x = 2\pi/N_x$ and
$\Delta y = 2\pi/N_y$, and the grid points are denoted by $x_i = i\Delta x$ and
$y_j = j\Delta y$. The approximated numerical values of $u$ and $v$ at the grid
point $(x_i, y_j)$ are denoted by $u_{ij}$ and $v_{ij}$.

We  first  describe  the  numerical  implementation  of  the  projection  P.  In 
the  periodic  case  this  is  easily  achieved  in  the  Fourier  space.  We  first  expand 
u  and  v  using  Fourier  collocation: 

$$u_N(x,y) = \sum_{l=-N_y/2}^{N_y/2} \sum_{k=-N_x/2}^{N_x/2} \hat u_{kl}\, e^{I(kx+ly)}, \qquad v_N(x,y) = \sum_{l=-N_y/2}^{N_y/2} \sum_{k=-N_x/2}^{N_x/2} \hat v_{kl}\, e^{I(kx+ly)} \qquad (14.4)$$

where $I = \sqrt{-1}$, and $\hat u_{kl}$ and $\hat v_{kl}$ are the Fourier
collocation coefficients, which can be computed from the point values $u_{ij}$
and $v_{ij}$ using either FFT or matrix-vector multiplications. The details can
be found in, e.g., [12]. Derivatives, either by spectral method or by central
differences, involve only multiplications by factors $d_k^x$ or $d_l^y$ in
(14.4), because $e^{I(kx+ly)}$ are eigenfunctions of such derivative operators.
For example,

$$d_k^x = Ik, \qquad d_l^y = Il \qquad (14.5)$$

for spectral derivatives;

$$d_k^x = \frac{2I\sin(k\Delta x/2)}{\Delta x}, \qquad d_l^y = \frac{2I\sin(l\Delta y/2)}{\Delta y} \qquad (14.6)$$

for the second order central differences which, when used twice, will produce
the second order central difference approximation for $w_{xx}$; and

$$d_k^x = \frac{I}{\Delta x}\sqrt{\frac{(1-\cos(k\Delta x))(7-\cos(k\Delta x))}{3}}, \qquad d_l^y = \frac{I}{\Delta y}\sqrt{\frac{(1-\cos(l\Delta y))(7-\cos(l\Delta y))}{3}} \qquad (14.7)$$

for the fourth order central differences which, when used twice, will produce
the fourth order central difference approximation

$$\frac{16(w_{i+1} + w_{i-1}) - (w_{i+2} + w_{i-2}) - 30 w_i}{12\,\Delta x^2}$$

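for $w_{xx}$. High order filters, such as the exponential filter [72], [58]:

$$\sigma_k^x = e^{-\alpha \left(\frac{|k|}{N_x/2}\right)^{2p}}, \qquad \sigma_l^y = e^{-\alpha \left(\frac{|l|}{N_y/2}\right)^{2p}} \qquad (14.8)$$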


where $2p$ is the order of the filter and $\alpha$ is chosen so that
$e^{-\alpha}$ is machine zero, can be used to enhance the stability while
keeping at least $2p$-th order of accuracy. This is especially helpful when the
projection $P$ is used for the under-resolved coarse grid with ENO methods. We
use the fourth order projection (14.7) and the filter (14.8) with $2p = 8$ in
our calculations. This will guarantee third order accuracy (fourth order in
$L_1$) of the ENO scheme. We will denote this combination (the fourth order
projection plus the eighth order filtering) by $P_4$. To be precise, if
$(\tilde u, \tilde v)^T = P_4 (u, v)^T$ and $\hat u_{kl}$ and $\hat v_{kl}$ are
the Fourier collocation coefficients of $u$ and $v$, then the Fourier
collocation coefficients of $\tilde u$ and $\tilde v$ are given by

$$\hat{\tilde u}_{kl} = \sigma_k^x \sigma_l^y\, \frac{d_l^y \left(d_l^y \hat u_{kl} - d_k^x \hat v_{kl}\right)}{(d_k^x)^2 + (d_l^y)^2}, \qquad \hat{\tilde v}_{kl} = \sigma_k^x \sigma_l^y\, \frac{-d_k^x \left(d_l^y \hat u_{kl} - d_k^x \hat v_{kl}\right)}{(d_k^x)^2 + (d_l^y)^2} \qquad (14.9)$$

where $\sigma_k^x$ and $\sigma_l^y$ are defined by (14.8) with $2p = 8$, and
$d_k^x$ and $d_l^y$ are defined by (14.7).
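
The projection-plus-filter step can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the code used for the computations reported here. The names `symbols` and `project_p4` are hypothetical. The filter is assumed to be $\exp(-\alpha(2|k|/N)^{2p})$ with $\alpha = 36$, so that $e^{-\alpha}$ is roughly double-precision machine zero. The square root in (14.7) is taken in the odd form $2I\sin(k\Delta x/2)\sqrt{(7-\cos(k\Delta x))/6}/\Delta x$, which has the same square as (14.7) but changes sign with $k$. The $(0,0)$ mode is left untouched, which preserves the mean values.

```python
import numpy as np

def symbols(n, h, p=4, alpha=36.0):
    """Fourth order difference symbol d and exponential filter sigma (assumed forms)."""
    k = np.fft.fftfreq(n, d=1.0 / n)            # integer wavenumbers in FFT order
    theta = k * h
    # Odd-in-k square root of the symbol of the fourth order stencil for w_xx:
    # d**2 == -(1 - cos(theta)) * (7 - cos(theta)) / (3 * h**2)
    d = 2j * np.sin(theta / 2.0) * np.sqrt((7.0 - np.cos(theta)) / 6.0) / h
    sigma = np.exp(-alpha * (2.0 * np.abs(k) / n) ** (2 * p))
    return d, sigma

def project_p4(u, v, hx, hy, p=4, alpha=36.0):
    """Apply the filtered fourth order projection to periodic fields u[i, j], v[i, j]."""
    nx, ny = u.shape
    dx, sx = symbols(nx, hx, p, alpha)
    dy, sy = symbols(ny, hy, p, alpha)
    DX, DY = np.meshgrid(dx, dy, indexing="ij")
    S = np.outer(sx, sy)
    uh, vh = np.fft.fft2(u), np.fft.fft2(v)
    denom = DX**2 + DY**2
    zero = denom == 0                            # only the mean mode (k, l) = (0, 0)
    w = (DY * uh - DX * vh) / np.where(zero, 1.0, denom)
    ut = np.where(zero, uh, DY * w)              # mean mode is kept unchanged
    vt = np.where(zero, vh, -DX * w)
    return np.fft.ifft2(S * ut).real, np.fft.ifft2(S * vt).real
```

Applied to a field that is already divergence free and well resolved, such as $(-\cos x \sin y, \sin x \cos y)$, this sketch returns the field essentially unchanged, since the filter is very close to one on the low modes.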

Next we shall describe the ENO scheme for (14.2). Since (14.2) is equivalent
to the non-conservative form (14.1), it is natural to implement upwinding
by the signs of $u$ and $v$, and to implement ENO equation by equation (the
component version described in Sect. 4.4). The $r$-th order ENO approximation
of, e.g., $(u^2)_x$ is thus carried out using the ENO Algorithm 4.2. We
mention a couple of facts needing attention:

1. Take $f(x) = u^2(x,y)$ with $y$ fixed. We start with the point values
$f_i = f(x_i)$;

2. The stencil of the reconstruction is determined adaptively by upwinding
and smoothness of $f(x)$. It starts with either $x_i$ or $x_{i+1}$ according to
whether $u > 0$ or $u < 0$.
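
The adaptive stencil choice in item 2 can be sketched as follows. This is only an illustration of the usual ENO selection idea (start from the upwind point, then repeatedly add the neighbor that gives the smaller difference in absolute value); it is not a reproduction of Algorithm 4.2. The function name `eno_stencil`, the periodic indexing, and the use of undivided differences on a uniform grid are assumptions made for the sketch.

```python
import numpy as np

def eno_stencil(f, i, r, upwind_left=True):
    """Return the leftmost index of an r-point ENO stencil for periodic data f.

    upwind_left=True starts from x_i (the u > 0 case in the text);
    upwind_left=False starts from x_{i+1} (the u < 0 case).
    """
    f = np.asarray(f, dtype=float)
    n = len(f)
    # diffs[k][j] is the k-th undivided difference built on points j, ..., j+k.
    diffs = [f]
    for _ in range(1, r):
        prev = diffs[-1]
        diffs.append(np.roll(prev, -1) - prev)
    left = i if upwind_left else i + 1
    for k in range(1, r):
        # Current stencil: left, ..., left+k-1. Compare the two possible extensions.
        d_left = abs(diffs[k][(left - 1) % n])   # adds the point left-1
        d_right = abs(diffs[k][left % n])        # adds the point left+k
        if d_left < d_right:
            left -= 1
    return left
```

For step data such as `[0, 0, 0, 0, 1, 1, 1, 1]`, the stencil for `i = 3` with `r = 3` moves entirely to the left of the jump, which is the behavior that keeps the reconstruction essentially non-oscillatory.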


There are two ways to handle the second derivative terms for the Navier-Stokes
equations. One can absorb them into the convection part and treat
them using ENO. For example, $f(x) = u^2(x,y)$ can be replaced by
$f(x) = u^2(x,y) - \mu u_x(x,y)$, where $u_x(x,y)$ itself can be obtained using
either ENO or central difference of a suitable order. The remaining procedure
for computing $f(x)_x$ would be the same as described above. Another simpler
possibility is just to use standard central differences (of suitable order) to
compute the double derivative terms. Our experience with compressible flow is
that there is little difference between the two approaches, especially when the
viscosity $\mu$ is small.
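
The second option can be illustrated with the fourth order central difference for a second derivative on a periodic grid. The sketch below uses the standard five-point stencil $[16(w_{i+1}+w_{i-1}) - (w_{i+2}+w_{i-2}) - 30w_i]/(12\Delta x^2)$; the function name `d2_central4` is hypothetical.

```python
import numpy as np

def d2_central4(w, h, axis=0):
    """Fourth order central approximation of the second derivative, periodic data."""
    wp1, wm1 = np.roll(w, -1, axis), np.roll(w, 1, axis)   # w_{i+1}, w_{i-1}
    wp2, wm2 = np.roll(w, -2, axis), np.roll(w, 2, axis)   # w_{i+2}, w_{i-2}
    return (16.0 * (wp1 + wm1) - (wp2 + wm2) - 30.0 * w) / (12.0 * h**2)
```

On $w = \sin x$ the error of this stencil drops by a factor close to 16 each time the grid is refined by a factor of two, as expected for a fourth order approximation.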

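In the above we have described the discretization for the spatial derivatives

$$L_{ij} \approx \begin{pmatrix} -(u^2)_x - (uv)_y + \mu (u_{xx} + u_{yy}) \\ -(uv)_x - (v^2)_y + \mu (v_{xx} + v_{yy}) \end{pmatrix}_{x = x_i,\; y = y_j} \qquad (14.10)$$

We then use the third order TVD (total variation diminishing) Runge-Kutta
method (9.10) to discretize the resulting ODE:

$$\begin{pmatrix} u_{ij} \\ v_{ij} \end{pmatrix}_t = P_4 L_{ij} \qquad (14.11)$$

obtaining:

$$\begin{aligned}
\begin{pmatrix} u \\ v \end{pmatrix}^{(1)} &= P_4 \left[ \begin{pmatrix} u \\ v \end{pmatrix}^{n} + \Delta t\, L^{n} \right] \\
\begin{pmatrix} u \\ v \end{pmatrix}^{(2)} &= P_4 \left[ \frac{3}{4} \begin{pmatrix} u \\ v \end{pmatrix}^{n} + \frac{1}{4} \begin{pmatrix} u \\ v \end{pmatrix}^{(1)} + \frac{1}{4} \Delta t\, L^{(1)} \right] \\
\begin{pmatrix} u \\ v \end{pmatrix}^{n+1} &= P_4 \left[ \frac{1}{3} \begin{pmatrix} u \\ v \end{pmatrix}^{n} + \frac{2}{3} \begin{pmatrix} u \\ v \end{pmatrix}^{(2)} + \frac{2}{3} \Delta t\, L^{(2)} \right]
\end{aligned} \qquad (14.12)$$

Notice that we have used the property $P_4 \circ P_4 = P_4$ in obtaining the
discretization (14.12) from (14.11).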
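
A third order TVD Runge-Kutta step with a projection applied at every stage can be sketched as follows. This is a minimal sketch assuming the standard three-stage TVD Runge-Kutta weights; the function name `tvd_rk3_step` and the callables `L` (spatial operator) and `P` (projection, identity by default) are hypothetical placeholders, not names from the text.

```python
import numpy as np

def tvd_rk3_step(w, dt, L, P=lambda w: w):
    """One third order TVD Runge-Kutta step for w_t = P(L(w)), projecting each stage."""
    w1 = P(w + dt * L(w))
    w2 = P(0.75 * w + 0.25 * (w1 + dt * L(w1)))
    return P(w / 3.0 + (2.0 / 3.0) * (w2 + dt * L(w2)))
```

For the linear test problem $w_t = -w$ with `P` the identity, one step reproduces the third order Taylor polynomial of $e^{-\Delta t}$, which gives a quick check of the stage weights.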

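This explicit time discretization is expected to be nonlinearly stable under
the CFL condition

$$\Delta t \left( \max_{i,j} \left( \frac{|u_{ij}|}{\Delta x} + \frac{|v_{ij}|}{\Delta y} \right) + 2\mu \left( \frac{1}{\Delta x^2} + \frac{1}{\Delta y^2} \right) \right) \le 1 \qquad (14.13)$$

For small $\mu$ (which is the case we are interested in) this is not a serious
restriction on $\Delta t$.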
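
A time step satisfying a restriction of this type can be computed directly from the current velocity field. The sketch below assumes the condition has the form $\Delta t\,[\max(|u|/\Delta x + |v|/\Delta y) + 2\mu(1/\Delta x^2 + 1/\Delta y^2)] \le 1$; the function name `max_time_step` and the safety factor `cfl` are hypothetical.

```python
import numpy as np

def max_time_step(u, v, dx, dy, mu, cfl=1.0):
    """Largest dt allowed by the assumed CFL condition, scaled by a safety factor."""
    advective = np.max(np.abs(u) / dx + np.abs(v) / dy)
    viscous = 2.0 * mu * (1.0 / dx**2 + 1.0 / dy**2)
    return cfl / (advective + viscous)
```

With $|u| = 1$, $v = 0$ and $\Delta x = \Delta y = 0.1$, the inviscid bound is $\Delta t \le 0.1$, and $\mu = 0.01$ lowers it only to $1/14$, in line with the remark that the viscous part is not a serious restriction for small $\mu$.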

We  present  some  numerical  examples  in  the  following. 


Example  14.1.  Accuracy  test.  This  example  is  used  to  check  the  third 
order  accuracy  of  our  ENO  scheme  for  smooth  solutions.  We  first  take  the 
initial  condition  as 


$$u(x,y,0) = -\cos(x)\sin(y), \qquad v(x,y,0) = \sin(x)\cos(y) \qquad (14.14)$$

which was used in [8]. The exact solution for this case is known:

$$u(x,y,t) = -\cos(x)\sin(y)\,e^{-2\mu t}, \qquad v(x,y,t) = \sin(x)\cos(y)\,e^{-2\mu t} \qquad (14.15)$$

We take $\Delta x = \Delta y = 2\pi/N$ with $N = 32, 64, 128$ and $256$. The
solution is computed up to $t = 2$ and the $L_2$ error and numerical order of
accuracy are listed in Table 14.1. For the $\mu = 0.05$ case, we list results
both with fourth order central approximation to the double derivative terms
(central) and with ENO to handle the double derivative terms by absorbing them
into the convection part (ENO). We can clearly observe fully third order
accuracy (actually better in many cases because the spatial ENO is fourth order
in the $L_1$ sense) in this table.


Table 14.1. Accuracy of ENO Schemes for (14.2).

        |     $\mu = 0$       | $\mu = 0.05$, central |  $\mu = 0.05$, ENO
   N    | $L_2$ error  order  | $L_2$ error  order    | $L_2$ error  order
   32   | 9.10(-4)      -     | 5.28(-4)      -       | 4.87(-4)      -
   64   | 5.73(-5)     3.99   | 3.20(-5)     4.04     | 3.09(-5)     3.98
   128  | 3.62(-6)     3.98   | 1.93(-6)     4.05     | 1.89(-6)     4.03
   256  | 2.28(-7)     3.99   | 1.18(-7)     4.03     | 1.16(-7)     4.03

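
The numerical order of accuracy in such a table is obtained from the errors on successive grids as $\log_2(e_N / e_{2N})$. A short sketch, using the errors 5.28(-4), 3.20(-5), 1.93(-6) and 1.18(-7) listed in Table 14.1 (the function name `observed_orders` is hypothetical):

```python
import math

def observed_orders(errors):
    """Orders log2(e_N / e_2N) from errors on grids refined by a factor of two."""
    return [math.log2(e0 / e1) for e0, e1 in zip(errors, errors[1:])]

errors = [5.28e-4, 3.20e-5, 1.93e-6, 1.18e-7]
print([round(o, 2) for o in observed_orders(errors)])   # [4.04, 4.05, 4.03]
```

The printed values agree with the orders 4.04, 4.05 and 4.03 listed next to these errors.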

Example  14.2.  Double  shear  layer.  This  is  our  test  example  to  study 
resolution  of  ENO  schemes  when  the  grid  is  coarse.  It  is  a  double  shear  layer 
taken  from  [8]: 


$$u(x,y,0) = \begin{cases} \tanh((y - \pi/2)/\rho), & y \le \pi \\ \tanh((3\pi/2 - y)/\rho), & y > \pi \end{cases} \qquad v(x,y,0) = \delta \sin(x) \qquad (14.16)$$

where we take $\rho = \pi/15$ and $\delta = 0.05$. The Euler equations
($\mu = 0$) are used for this example. The solution quickly develops into
roll-ups with smaller and smaller scales, so on any fixed grid the full
resolution is lost eventually. For example, the expensive run we performed
using $512^2$ points for the spectral collocation code (with an 18-th order
filter (14.8)) is able to resolve the solution fully up to $t = 8$ (Fig. 14.1,
top left), as verified by the spectrum of the solution (not shown here), but
begins to lose resolution as indicated by the wriggles in the vorticity contour
at $t = 10$ (not shown here). On the other hand, the ENO runs with $64^2$ (top
right) and $128^2$ points (bottom left) produce smooth, stable results
(Fig. 14.1). In Fig. 14.1, bottom right, we show a cut at $x = \pi$ for $v$ at
$t = 8$. This gives a better feeling about the resolution in physical space.
Apparently with these coarse grids the full structure of the roll-up is not
resolved. However, when we compute the total circulation


$$c_\Omega = \int_\Omega \omega(x,y)\,dx\,dy = \oint_{\partial\Omega} u\,dx + v\,dy \qquad (14.17)$$

around the roll-up by taking $\Omega = [\pi/2, 3\pi/2] \times [0, 2\pi]$ and
using the rectangular rule (which is infinite order accurate for the periodic
case) on the line integrals at the right-hand side of (14.17), we can see that
this number is resolved much better than the roll-up itself, Table 14.2.
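
For a region that spans a full period in $y$, the contributions of $u\,dx$ on the bottom and top edges cancel by periodicity, so the circulation reduces to the difference of two line integrals of $v$ along the vertical edges. The sketch below evaluates these with the rectangular rule. It assumes the vertical edges lie on grid lines $x_{i_0}$ and $x_{i_1}$ and that `v[i, j]` is stored at $(x_i, y_j)$; the function name `circulation` is hypothetical.

```python
import numpy as np

def circulation(v, i0, i1, dy):
    """Circulation around [x_i0, x_i1] x [one full y-period] by the rectangular rule."""
    # Counterclockwise: up along x = x_i1, down along x = x_i0.
    return dy * (np.sum(v[i1, :]) - np.sum(v[i0, :]))
```

Because the integrands are periodic in $y$, the rectangular rule is exact for resolved trigonometric data, which is why the text describes it as infinite order accurate here.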


Fig. 14.1. Double shear layer. Contours of vorticity, $t = 8$. Top left: spectral
with $512^2$ points; Top right: ENO with $64^2$ points; Bottom left: ENO with
$128^2$ points; Bottom right: the cut at $x = \pi$ of $v$, spectral method with
$512^2$ points, ENO method with $64^2$ and with $128^2$ points.


Example 14.3. Level set formulation and vortex sheet. As an application
of the ENO scheme for incompressible flow, we consider the motion of an
incompressible  fluid,  in  two  and  three  dimensions,  in  which  the  vorticity  is 
concentrated  on  a  lower  dimensional  set  [40].  Prominent  examples  are  vortex 
sheets  and  vortex  filaments  in  three  dimensions,  and  vortex  sheets,  vortex 
dipole  sheets  and  point  vortices  in  two  dimensions. 

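In three dimensions, the equations are written in the form

$$\begin{aligned}
\xi_t + v \cdot \nabla \xi - \nabla v\, \xi &= 0 \\
\nabla \times v &= \xi \\
\nabla \cdot v &= 0
\end{aligned} \qquad (14.18)$$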

where $\xi(x,y,z,t)$ is the vorticity vector, and $v(x,y,z,t)$ is the velocity vector.

Table 14.2. Resolution of the Total Circulation.

   t   |    2         4         6         8         10
       | 0.87300   3.07100   7.16889    9.88063   10.90122
       | 0.87452   2.97810   7.30999   10.34414   11.79418
       | 0.87433   2.98029   7.28308   10.46212   11.85875


In a vortex sheet, $\xi$ is a singular measure concentrated on a two dimensional
surface, while in a vortex filament, $\xi$ is a function concentrated on a
tubular neighborhood of a curve.

We  use  an  Eulerian,  fixed  grid,  approach,  that  works  in  general  in  two  and 
three  dimensions.  In  the  particular  case  of  the  two  dimensional  vortex  sheet 
problem  in  which  the  vorticity  does  not  change  sign,  the  approach  yields  a 
very  simple  and  elegant  formulation. 

The basic observation involves a variant of the level set method for capturing
fronts, developed in [78].

The formulation we use here regularizes general ill-posed problems via the
level set approach, using the idea that a simple closed curve which is the level
set of a function cannot change its index, i.e. there is an automatic topological
regularization. This is very helpful for numerical calculations. The
regularization is automatically accomplished through the use of dissipative
schemes, which has the effect of adding a small curvature term (which vanishes
as the grid size goes to zero) to the evolution of the interface. The formulation
allows for topological changes, such as merging of surfaces.

The main idea is to decompose $\xi$ into a product of the form

$$\xi = P(\varphi)\,\eta \qquad (14.19)$$

where $P$ is a scalar function, typically an approximate $\delta$ function. The
variable $\varphi$ is a scalar function whose zero level set represents the
points where vorticity concentrates, and $\eta$ represents the vorticity
strength vector. This decomposition is performed at time zero and is of course
not unique.

The observation is that once a decomposition is found, the following system
of equations yields a solution to the Euler equations, replacing the original
set of equations (14.18):

$$\begin{aligned}
\varphi_t + v \cdot \nabla \varphi &= 0 \\
\eta_t + v \cdot \nabla \eta - \nabla v\, \eta &= 0 \\
\nabla \times v &= P(\varphi)\,\eta \\
\nabla \cdot v &= 0
\end{aligned} \qquad (14.20)$$

These equations have initial conditions

$$\varphi(0, \cdot) = \varphi_0, \qquad \eta(0, \cdot) = \eta_0$$

where $\varphi_0$, $\eta_0$ and $P$ are chosen so that (14.19) holds at time
$t = 0$. Notice that (14.19) and (14.20) imply that $\nabla\varphi$ is
orthogonal to $\eta$, and $\mathrm{div}(\eta) = 0$. This is enforced in the
initial condition and is maintained automatically by (14.19) and (14.20).

When $P$ is a distribution, such as a $\delta$ function, approaching $P$ with a
sequence of smooth mollifiers $P_\epsilon$ yields a sequence of approximating
solutions. This is the approach used in numerical calculations, since the
$\delta$ function can only be represented approximately on a finite grid. The
parameter $\epsilon$ is usually chosen to be proportional to the mesh size.

The advantage of this formulation is that it replaces a possibly singular
and unbounded vorticity function $\xi$ by bounded, smooth (at least uniformly
Lipschitz) functions $\varphi$ and $\eta$. Therefore, while it is not feasible
to compute solutions of (14.18) directly, it is very easy to compute solutions
of (14.20).
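In two dimensions, the vorticity is given by

$$\xi = (0, 0, \omega)^T$$

and hence the Euler equations are given by

$$\begin{aligned}
\omega_t + v \cdot \nabla \omega &= 0 \\
\mathrm{curl}(v) &= \omega \\
\mathrm{div}(v) &= 0
\end{aligned} \qquad (14.21)$$

Our formulation (14.20) becomes

$$\begin{aligned}
\varphi_t + v \cdot \nabla \varphi &= 0 \\
\eta_t + v \cdot \nabla \eta &= 0 \\
\mathrm{curl}(v) &= P(\varphi)\,\eta \\
\mathrm{div}(v) &= 0
\end{aligned} \qquad (14.22)$$

where $\eta$ is now a scalar.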

If the vortex sheet strength $\eta$ does not change sign along the curve, it can
be normalized to $\eta = 1$ and the equations take on a particularly simple and
elegant form:

$$\varphi_t + v(\varphi) \cdot \nabla \varphi = 0 \qquad (14.23)$$

where the velocity $v(\varphi)$ is given by

$$v = -\begin{pmatrix} -\partial_y \\ \partial_x \end{pmatrix} \Delta^{-1} P(\varphi) \qquad (14.24)$$

In this case, the vortex sheet strength along the curve is given by
$1/|\nabla\varphi|$ (see (14.26)).

We first consider the periodic vortex sheet in two dimensions, i.e.
$P(\varphi) = \delta(\varphi)$ in (14.24). The three dimensional case is defined
in detail later. The
evolution of the vortex sheet in the Lagrangian framework has been considered
by various authors. Krasny [59], [60] has computed vortex sheet roll-up
using vortex blobs and point vortices with filtering. Baker and Shelley [4]
have approximated the vortex sheet by a layer of constant vorticity which
they computed by Lagrangian methods. In the context of our approach, their
approximation corresponds to approximating the $\delta$ function by a step
function.

In our framework, we use a fixed Eulerian grid, and approximate (14.23)
by the third order upwind ENO finite difference scheme with a third order
TVD Runge-Kutta time stepping. At every time step, the velocity $v$ is first
obtained by solving the Poisson equation for the stream function $\Psi$:

$$\Delta \Psi = -P(\varphi)$$

with boundary conditions

$$\Psi(x, \pm 1) = 0$$

and periodic in $x$. This is done by using a second order elliptic solver
FISHPAK. Once $\Psi$ is obtained, the velocity is recovered by
$v = (-\Psi_y, \Psi_x)$ by using either ENO or central difference approximations
(we do not observe a major difference between the two: the results shown are
those obtained by central difference). Once $v$ is obtained, upwind biased ENO
is easily applied to (14.23).
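
The elliptic step can be illustrated with a small second order solver for $\Delta\Psi = -P$ that is periodic in $x$ and has $\Psi = 0$ on $y = \pm 1$. This is a stand-in sketch, not the FISHPAK routine mentioned in the text: it uses an FFT in $x$ and a dense linear solve in $y$ for clarity rather than speed, and the names `solve_stream_function` and `velocity` are hypothetical. The arrays hold interior $y$-nodes only, since the wall values are known to be zero.

```python
import numpy as np

def solve_stream_function(P, dx, dy):
    """Second order solve of Laplace(Psi) = -P; periodic in x, Psi = 0 on the y-walls.

    P[i, j] is given at x_i (i = 0..nx-1, periodic) and at interior y-nodes only.
    """
    nx, ny = P.shape
    # Eigenvalues of the periodic second-difference operator in x.
    lam = (2.0 * np.cos(2.0 * np.pi * np.arange(nx) / nx) - 2.0) / dx**2
    # Second-difference matrix in y with homogeneous Dirichlet walls.
    D = (np.diag(np.full(ny, -2.0))
         + np.diag(np.ones(ny - 1), 1)
         + np.diag(np.ones(ny - 1), -1)) / dy**2
    rhs_hat = np.fft.fft(-P, axis=0)
    psi_hat = np.empty_like(rhs_hat)
    for m in range(nx):                      # one small system per x-wavenumber
        psi_hat[m] = np.linalg.solve(D + lam[m] * np.eye(ny), rhs_hat[m])
    return np.fft.ifft(psi_hat, axis=0).real

def velocity(psi, dx, dy):
    """Central differences for v = (-Psi_y, Psi_x), using Psi = 0 on the walls."""
    padded = np.pad(psi, ((0, 0), (1, 1)))   # append the zero wall values in y
    psi_y = (padded[:, 2:] - padded[:, :-2]) / (2.0 * dy)
    psi_x = (np.roll(psi, -1, axis=0) - np.roll(psi, 1, axis=0)) / (2.0 * dx)
    return -psi_y, psi_x
```

A manufactured solution such as $\Psi = \sin(\pi x)\sin(\pi(y+1)/2)$, for which $P = \tfrac{5}{4}\pi^2\Psi$, gives a convenient check that the error decreases at second order.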

The initial conditions are similar to the ones in [60], i.e. given by a
sinusoidal perturbation of a flat sheet:

$$\varphi_0(x, y) = y + 0.05 \sin(\pi x)$$

The boundary conditions for $\varphi$ are periodic, of the form:

$$\varphi(t, -1, y) = \varphi(t, 1, y), \qquad \varphi(t, x, -1) = \varphi(t, x, 1) - 2$$

The $\delta$ function is approximated as in [80], [99] by

$$\delta_\epsilon(\varphi) = \begin{cases} \dfrac{1}{2\epsilon}\left(1 + \cos\left(\dfrac{\pi\varphi}{\epsilon}\right)\right) & \text{if } |\varphi| < \epsilon \\ 0 & \text{otherwise} \end{cases} \qquad (14.25)$$

For fixed $\epsilon$, there is convergence as $\Delta x \to 0$ to a smooth
solution. One can then take $\epsilon \to 0$. This two step limit is very costly
to implement numerically. Our numerical results show that one can take
$\epsilon$ to be proportional to $\Delta x$, but convergence is difficult to
establish theoretically.
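
A cosine-type approximate $\delta$ function of this kind is easy to code and to check. The sketch below assumes the form $\frac{1}{2\epsilon}(1 + \cos(\pi\varphi/\epsilon))$ for $|\varphi| < \epsilon$ and zero otherwise, which has unit integral; the function name `delta_eps` is hypothetical.

```python
import numpy as np

def delta_eps(phi, eps):
    """Cosine approximation of the delta function with half-width eps."""
    phi = np.asarray(phi, dtype=float)
    inside = np.abs(phi) < eps
    return np.where(inside, (1.0 + np.cos(np.pi * phi / eps)) / (2.0 * eps), 0.0)
```

The function is continuous, vanishes at $|\varphi| = \epsilon$, and peaks at $1/\epsilon$ on the zero level set, so shrinking $\epsilon$ with the mesh size concentrates the vorticity on a thinner layer.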

In Fig. 14.2, top left, we present the result at $t = 4$ of using ENO with
$128^2$ grid points with the parameter $\epsilon$ in the approximate $\delta$
function chosen as $\epsilon = 12\Delta x$. We use the graphic package TECPLOT
to draw the level curve of $\varphi = 0$. Next, we keep $\epsilon = 12\Delta x$
but double the grid points in each direction to $256^2$; the result at $t = 4$
is shown in Fig. 14.2, top right. Comparing with Fig. 14.2, top left, we can see
that there are more turns in the core at the same physical time when the grid
size is reduced and the $\delta$ function width $\epsilon$ is kept proportional
to $\Delta x$. One might wonder whether the core structure of Fig. 14.2, top
right, is distorted by numerical error. To verify that this is not the case, we
keep $\epsilon = 12 \times \frac{2}{256}$ fixed, and reduce $\Delta x$
(Fig. 14.2, bottom two). The three pictures overlay very well, and the bottom
two pictures in Fig. 14.2 are indistinguishable, indicating that the core
structure is a resolved solution to the problem and convergence is obtained with
fixed $\epsilon$. By reducing $\epsilon$ for the more refined grids, more turns
in the core can be obtained in shorter time (pictures not shown).


Fig. 14.2. Two dimensional vortex sheet simulation, $t = 4$. Top left: ENO with
$128^2$ points, $\delta$ function width $\epsilon = 12\Delta x$; Top right: ENO
with $256^2$ points, $\delta$ function width $\epsilon = 12\Delta x$; Bottom
left: ENO with $512^2$ points, $\delta$ function width $\epsilon = 24\Delta x$;
Bottom right: ENO with $1024^2$ points, $\delta$ function width
$\epsilon = 48\Delta x$.


The smoothing of the $\delta$ function, the third order truncation error in
the advection step, and the second order error in the inverse Laplacian are
the only smoothing steps in our method.

We now give the same example in three dimensions. We first sketch the
algorithm for initializing and computing a periodic 3D vortex sheet, using
(14.20).

We let $P(\varphi) = \delta(\varphi)$ (in practice $\delta$ is replaced by an
approximation). The zero level set of $\varphi$ is the vortex sheet $\Gamma(s)$,
parameterized by surface area $s$. The variable $\eta_0$ is chosen to fit the
initial vortex sheet strength. For instance, given any smooth test function $g$,

$$(\xi, g) = (\eta_0\, \delta(\varphi_0), g) = \int \eta_0(\Gamma_0(s))\, g(\Gamma_0(s))\, \frac{1}{|\nabla\varphi_0|}\, ds$$

Thus, the initial vortex sheet strength is given by

$$\frac{\eta_0}{|\nabla\varphi_0|} \qquad (14.26)$$

To obtain the velocity vector, one introduces the vector potential $A$, where

$$v = \nabla \times A, \qquad \mathrm{div}(A) = 0$$

and solves the Poisson equation

$$\Delta A = -P(\varphi)\,\eta \qquad (14.27)$$

To ensure that $\mathrm{div}(A) = 0$, we require that $\mathrm{div}(\eta) = 0$
and that $\nabla\varphi \cdot \eta = 0$ initially. It is easy to see that these
equalities are maintained as $t$ increases.

The boundary conditions for the velocity are $v_2(x, \pm 1, z) = 0$ and periodic
in $x$ and $z$. To obtain the boundary conditions for $A = (A_1, A_2, A_3)$, we
use the divergence free condition on $A$ in addition to the velocity boundary
condition. Thus,

$$A_1(x, \pm 1, z) = A_3(x, \pm 1, z) = 0, \qquad \partial_y A_2(x, \pm 1, z) = 0 \qquad (14.28)$$

and periodic in $x$, $z$. The Neumann condition requires the following
compatibility condition

$$\int P(\varphi)\,\eta_2(x, y, z, 0)\,dx\,dy\,dz = 0$$

Three dimensional runs are much more expensive than two dimensional
runs, not only because the number of grid points increases, but also because
there are now four evolution equations (for $\varphi$ and $\eta$), and three
potential equations. We still use the third order ENO scheme coupled with the
second order elliptic solver FISHPAK, with $64^3$ grid points, and $\epsilon$ is
chosen as $6\Delta x$, which is the same in magnitude as that used in Fig. 14.2
of the two dimensional example. The boundary conditions for $\varphi$ are
similar to the ones in two dimensions: periodic in all directions (modulo the
linear term in $y$). The vortex sheet strength vector $\eta$ is periodic in all
directions.

We first verify whether we can recover the two dimensional results with
the three dimensional setting. We use the initial condition

$$\varphi_0(x, y, z) = y + 0.05 \sin(\pi x)$$

which is the same as that for the two dimensional example, and choose a
constant initial condition for $\eta$ as $\eta_0(x, y, z) = (0, 0, 1)$. We
observe exact agreement with our two dimensional results in Fig. 14.2. Next, we
consider the truly three dimensional problem with the initial condition chosen
as

$$\varphi_0(x, y, z) = y + 0.05 \sin(\pi x) + 0.1 \sin(\pi z)$$

and $\eta$ is chosen as $\eta_0(x, y, z) = (0, -0.1\pi\cos(\pi z), 1)$, which
satisfies the divergence free condition as well as the condition to be
orthogonal to $\nabla\varphi$. In Fig. 14.3, left, we show the level set of
$\varphi = 0$ for $t = 5$. We can clearly see the roll up process and the three
dimensional features. The cut at the constant $z = 0$ plane is shown in
Fig. 14.3, right.
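
The two constraints on the three dimensional initial data can be checked by hand. The following worked computation uses only the initial condition $y + 0.05\sin(\pi x) + 0.1\sin(\pi z)$ and the strength vector $(0, -0.1\pi\cos(\pi z), 1)$ given above:

```latex
\begin{align*}
\nabla\varphi_0 &= \bigl(0.05\pi\cos(\pi x),\; 1,\; 0.1\pi\cos(\pi z)\bigr), \\
\eta_0 \cdot \nabla\varphi_0 &= 0 \cdot 0.05\pi\cos(\pi x) \;-\; 0.1\pi\cos(\pi z)\cdot 1 \;+\; 1 \cdot 0.1\pi\cos(\pi z) = 0, \\
\operatorname{div}(\eta_0) &= \partial_x (0) + \partial_y\bigl(-0.1\pi\cos(\pi z)\bigr) + \partial_z (1) = 0 .
\end{align*}
```

The second component of $\eta_0$ is exactly what cancels the $z$-derivative of the added perturbation $0.1\sin(\pi z)$, and since it depends on $z$ only, it does not contribute to the divergence.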


Fig. 14.3. Three dimensional vortex sheet simulation, $t = 5$. ENO with $64^3$
points. $\delta$ function width $\epsilon = 6\Delta x$. Left: three dimensional
level surface; Right: $z = 0$ plane cut.